Difference between revisions of "HoudiniTops"
|Line 30:||Line 30:|
Well, my overview anyway.
Well, my overview anyway.
=== Tops vs Rops ===
=== Tops vs Rops ===
Tops is a node graph, built in python, to do stuff. It's similar to Rops in that it lets you chain operations together and control execution order. Rops only lets you see what's going on at a high level, eg 'run mantra' or 'do a sim', while Tops is like going from /obj down into sops; you can see the inner workings of the stuff you're running.
Tops is a node graph, built in python, to do stuff. It's similar to Rops in that it lets you chain operations together and control execution order. Rops only lets you see what's going on at a high level, eg 'run mantra' or 'do a sim', while Tops is like going from /obj down into sops ; you can see the inner workings of the stuff you're running.
=== Workitems vs points ===
=== Workitems vs points ===
Revision as of 17:53, 7 December 2020
Simple cache then render locally
Download hip: File:pdg_basic_v01.hip
Most existing Houdini users want the basics from PDG; cache a sim to disk, run a render. Maybe chaser mode as a bonus? FFmpeg the result into an mp4, why not eh, YOLO!
Here's that setup. Click the triangle with the orange thingy on it to start.
cache sim is a fetch top that points to a disk cache sop after a simulation. You DON'T want a sim cache running on multiple threads/machines, it should just be one job that runs sequentually. To do this enable 'all frames in one batch'.
map by index controls execution order and limits how jobs are created. If you have node A that generates 10 things, connected to node B that is also set to generate 10 things, PDG's default behavior is to generate 10 B things for each thing made by A. In other words, you'll get 10 x 10 = 100 total tasks. For situations like this, that's definitely not what you want.
The mapbyindex ensures tasks are linked together, so 1 frame of the cache is linked to 1 frame of the render. Further, it allows a 'chaser' mode, in that as soon as frame 1 of the sim cache is done, frame 1 of the mantra render can start, frame 2 of the sim cache is done, frame 2 of the mantra render can start etc.
mantra is another fetch top that points to a mantra rop.
waitforall does as implied, it won't let downstream nodes start until all the upstream nodes are completed. It also subtly adjust the flow of tasks; the previous nodes have 48 dots representing the individual frames, this node has a single rectangle, implying its now treating the frame sequence as a single unit.
ffmpeg top needs some explaining (and some adjustments to the fetch top that calls the mantra rop), which I explain below.
Note that the frameranges on the fetch tops override the ranges set on their target rops by default.
Also note that the button with the orange thingy on it kicks off the output, looking for the matching node with the orange output flag. See in that screenshot how I've left it on the mantra node? That means it'll never run the ffmpeg task. I'm an idiot.
Well, my overview anyway.
Tops vs Rops vs sops
Tops is a node graph, built in python, to do stuff. It's similar to Rops in that it lets you chain operations together and control execution order. Rops only lets you see what's going on at a high level, eg 'run mantra' or 'do a sim', while Tops is like going from /obj down into sops and playing with points; you can see the inner workings of the stuff you're running.
Workitems vs points
If going from /obj to sops lets you work with points, going from rops to tops lets you play with workitems, the atomic unit of tops. At its simplest you can think of a workitem as a frame, so if you have a mantra top that renders frames 1 to 100, then when you run the graph you'll see 100 workitems. Tops visualises those as little dots on each top node.
Workitems vs packed prims
In a similar way that a point in sops can represent an actual point, or a packed primitive, or a RBD object, or whatever else, a workitem doesn't have to be just a rendered image. It could be a frame of a cache, or each node could represent a full image sequence on disk. Work items themselves can be collated together into groups called partitions, and expanded out again later into workitems (like packing and unpacking geometry). All powerful, but new concepts and workflows to deal with, hard to grasp at first.
Generating vs cooking
Watch some of the masterclasses about sops, and you'll hear sidefx folk talk about dirtying nodes, traversing bottom up through the graph setting states, then working top down cooking nodes as it goes. You could think of that first process of traversing bottom-up as generating a list of work to do, and then the working top-down as actually cooking that list of work.
Tops exposes that generating vs cooking to the user. For simple graphs you don't have to worry about it, tops will generate and cook in one hit for you, but it's good to know the difference when you get stuck on more complex graphs.
The reasoning is you might have some nodes that take hours to calculate, but you don't always need to execute those nodes to know ahead of time how many frames (workitems) it could generate. In fact you might be able to do quite a lot of work designing your graph without ever having to execute any nodes, sort of like keeping Houdini in manual/nocook mode.
You can right click on a node and choose 'generate'. If the node is smart enough, it will generate workitems for you, which you can then cook. Normally you just generate and cook at the same time, with the keyboard shortcut shift-v.
Inputs and outputs
If tops nodes are to be chained together, they need to somehow pass information betweeen them. Similar to sops and point attributes, tops uses workitem attributes. While points must have @ptnum and @P, workitems usually have an index, input and output defined. Broadly speaking, input is what the node will work on, and output is where it will write the result to.
Eg, you have a filepattern top and a ffmpegextractimages top linked together. The filepattern top creates workitems from places on disk, so if you pointed it at $HIP/myvideos/*.mp4, when you cook that node and inspect a single workitem, there's no input attributes (it has no parent node giving it stuff), while output will be $HIP/myvideos/funnycat01.mp4.
Moving to the ffmpeg node, if you r.click on it, choose 'generate', then inspect a workitem, you'll see that input is set to $HIP/myvideos/funnycat01.mp4. Ie, this is the input the node will use to do stuff.
There's no output yet, because the node hasn't cooked. Cook it, inspect the workitem again, there's now an output attribute, which contains an array of the image sequence generated by ffmpeg.
Append another node, say a generic generator (kind of like a fancy null), and generate+cook it, select a workitem, now you can see that the input on this node is the same as the output from the previous node.
Took me a while to get used to this, I couldn't follow the relationship between inputs and outputs. In hindsight it makes sense; outputs from the previous node are copied to inputs for the next node. What this node does to the outputs is up to the node! It might copy the input to the output untouched (say like this generator 'null', or a node that is just creating other attributes), or it could generate completely new outputs (say converting image sequences back into mp4's).
What if you need to get to those attributes in code? What if you want to define your own attributes? What if you want to use those attributes outside of tops?
The input and output are exposed as @pdg_input and @pdg_output. A lot of work in tops is done using hscript expressions on parameters, so most of the time you have to escape them in backticks.
Pdg attributes are available to the rest of houdini, and can be used where you'd do things like $HIP, $OS, $F, $T etc. When the tops graph is run, and that particular workitem within the tops node is being processed, those pdg attributes will be set and can be used. Eg, you use a Fetch top to run a rop somewhere else in houdini, say a cops graph. You could set the file cop at the top of your compositing network to use `@pdg_input` as the image source, and it will then be replaced with whatever image sequence you choose to find (or generate) in tops.
Short version: If things are acting weird, set the 'generate when' option to dynamic, you'll get a little purple icon to say it's now dynamic, stuff should work.
Long version: Tops makes a distinction between 'generate' and 'cook' steps for nodes. Ideally you would know ahead of time what each node is going to produce, so you get a sense of how many images/sim caches/mp4's you're going to generate before the work is done. But sometimes you can't know that in advance, and you have to adjust workflow accordingly.
Eg, say you're using a Fetch Top to run a mantra render. You don't need to execute the render to find out how many frames you'll generate, its right there in the parameters on the mantra rop (or better yet on the fetch top itself). If you r.click the node and go 'generate', it'll populate with as many dots as you have frames in the render. Easy.
Now say you have a ffmpegextractimages top, and you want to use an attribute create top to set @framecount, the number of frames that were extracted. Tops cannot know this number until ffmpeg actually runs.
The 'generate when' mode at the top of every node tells the node when to do its workitem calculation (the generate step). The default is 'automatic', tops will try and guess if nodes should wait for previous nodes to cook or not. But sometimes tops guesses wrong, and things will silently misbehave. In this case with the framecount setup, I had to set the mode to 'each upstream item is cooked'. So now when I'm processing many videos, only when each video has been extracted will the attribute create node start, and be able to create the right result.
@pdg_input is blank
In summary, check the nodes above the erroring one, especially if they have an unchecked copy inputs to outputs toggle.
I've had a few occasions where I'll have a node error, go inspect a workitem and see that where I'm expecting to find a value for `@pdg_input` in a parameter, its actually an empty string.
The culprit is always a previous node. Related to what I mentioned before, nodes usually expect an input, and most of the times set an output. What can happen if you're not careful is some top nodes may not set an output, or don't do the implicit 'copy input to output' if they don't need to. The generic generator node, which I use occasionally as null to just have a place to see whats flowing through, is an example of this.
Set a limit on the number of tasks
Say you have a folder full of images that you want to process, but for testing just want the first 5 images.
A filterbyrange top will let you do this.
Get framecount from ffmpeg extract images
The ffmpeg extract images node usually generates a output attribute which is an array of the images its created. At some point this stopped working for me, so I had to find another way to count the images.
After talking to support it was due to putting quotes around the output parameter path. Doh. Still, leaving this here as it's still a good mini example of tops workflows.
So as I said, we could try and find the files in the ffmpeg output folder for the node. A filePattern top can do this. If the ffmpegextractimages node is using the default
The following filepattern node should point to the folder above it, ie
It also has 'split files into seperate items' turned OFF, so that way I just have a single workitem per image sequence.
If you try this now, it won't generate any workitems. The default generate mode will mean it tries to look in that folder straight away, finds no images, and as such returns no workitems. Change the generate mode to 'each upstream item is cooked', then it works as expected.
Ok great, but where's the actual number of frames? It's there, annoyingly hidden. There's some tops attributes that don't appear in the middle click info, one of those is @pdg_outputsize. In this case, unsurprisingly, it returns the amount of frames in the sequence. So with an attribute create node, you can create a new integer variable called framecount, and set an expression to use @pdg_outputsize.
Note that you don't need to change the generate mode on the attribute create. As soon as any upstream node is set to be dynamic (ie, it has to wait for previous items to cook), all subsequent nodes are also made dynamic.
Create attribute from string array element
Related to the previous example, I used an attribute from string node to split a directory path into components, and then wanted to create a new attribute based on the last part of that path. Annoyingly I couldn't work out how to get an array element from the standard attribute create node, so I gave up and used a python script node instead:
renderpass = work_item.attrib('split')[-1] work_item.setStringAttrib("renderpass", renderpass)
Had another go, here's a purely node based approach:
The tops method for getting array elements is @attribute.index, eg @split.0. Here I wanted the last element, but there's no python style -1 syntax, so instead I create the array, reverse it, read the 0th element. Specifically:
- attribute from string top, split by delimiter enabled, '/' as the delimeter
- attribute array top, update existing 'split' attribute, reversed enabled
- attribute create top, uses `@split.0`
Ffmpeg and non sidefx rops
I had a few issues getting ffmpeg to make mp4's from a renderman rop. In the end the fixes were relatively straightfoward.
The ffmpeg top needs to know a few things:
- what part of the upstream nodes are making images, set with output file tag
- what the location of those images are, set with output parm name
- that the images are bundled into a single unit of work, using a waitforall top.
Top nodes can tag their output(s), in this case the ffmpeg top expects the images to have a 'file/image' tag. On the fetch top for renderman rop enable 'output file tag' and use the dropdown to select 'file/image'
To know what file name to put in that tag, enable 'output parm name' and set it to 'ri_display_0'. This is the parameter on the ris rop where the image path is set.
To bundle all the frames into a single unit, use a waitforall top.
A last specific thing for our setup, our build of ffmpeg didn't understand the '-apply_trc' option, so I disabled it.
Force python scripts to run on the farm
If you have a python script node, even if you have a tractor or deadline scheduler, it will run in your local houdini session by default.
To fix this, turn off 'Evaluate in process'.
Ensure a rop geometry top sim runs on a single blade
You don't want 240 machines all doing their own run up, that's silly. Go to the 'rop fetch' tab, enable 'all frames in one batch', that'll lock it to a single blade and run sequentially.
Tractor scheduler stuck
too often less often after some fixes from sidefx. Tricks to unstick it in order of least to most annoying:
- Make sure there's no active stuck jobs of yours on the farm, delete 'em all and try again
- R.click on tractor scheduler, 'delete temp directory'
- Select the tractor scheduler, ctrl-x to cut it, ctrl-v to paste it
- Reload the hip
- Restart houdini
- Quit FX
- Quit the industry
Rez and tops debugging
Running this in a python script top to see whats going on with rez and environment values:
print('debug info...') key = 'REZ_RESOLVE' print(key+'='+os.environ[key]) print ('') import pdg has_tractor = str(pdg.types.schedulers.has_tractor) print('pdg.types.schedulers.has_tractor: ' + has_tractor) print ('')
Mostly works, the long story can be found below, but here's the summary:
- Your environment needs access to the python tractor api. If you use rez, make sure to bring in a package for tractor.
- PDG assumes it'll find $PYTHON set correctly. We didn't have this, but even then I found I couldn't use the regular system python, but had to point it to hython ( $HFS/bin/hython )
- If your farm is behind a firewall, make sure your IT department chooses 2 ports you can use, and enter those ports into the callback and relay port fields on the tractor scheduler
- As of 18.0.502 retry support exists on the tractor scheduler, as well as options for better logging.
- Cooking jobs by default expects to connect to your desktop machine to update information, give you blinky lights and dots. This means that if you close your houdini session, the job will stop working on the farm. Call me old fashioned, but that defeats most of the point of using a farm. If you don't want this, use the 'submit graph as job' option at the top of the tractor scheduler, and it will run independent of your GUI session. Getting these to work reliably was problematic for us, YMMV.
Tops and tractor diary
Moving the diary to TopsTractorDiary.