Difference between revisions of "HoudiniTops"
Revision as of 16:14, 2 December 2020
Simple cache then render locally
Download hip: File:pdg_basic_v01.hip
Most existing Houdini users want the basics from PDG: cache a sim to disk, run a render. Maybe chaser mode as a bonus? FFmpeg the result into an mp4, why not eh, YOLO!
Here's that setup. Click the triangle with the orange thingy on it to start.
cache sim is a fetch top that points to a disk cache sop after a simulation. You DON'T want a sim cache running on multiple threads/machines; it should be one job that runs sequentially. To do this, enable 'all frames in one batch'.
map by index controls execution order and limits how jobs are created. If you have node A that generates 10 things, connected to node B that is also set to generate 10 things, PDG's default behavior is to generate 10 B things for each thing made by A. In other words, you'll get 10 x 10 = 100 total tasks. For situations like this, that's definitely not what you want.
The mapbyindex ensures tasks are linked together, so 1 frame of the cache is linked to 1 frame of the render. Further, it allows a 'chaser' mode: as soon as frame 1 of the sim cache is done, frame 1 of the mantra render can start; when frame 2 of the sim cache is done, frame 2 of the mantra render can start, and so on.
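To make the 10 x 10 versus 1-to-1 difference concrete, here's a plain Python sketch. It doesn't touch PDG at all; the function names and the fake frame names are made up for illustration, and it only mimics how the two pairing styles count tasks.

```python
from itertools import product


def cross_product_tasks(upstream, downstream):
    """Pair every downstream item with every upstream item."""
    return list(product(upstream, downstream))


def map_by_index_tasks(upstream, downstream):
    """Pair item N upstream only with item N downstream."""
    return list(zip(upstream, downstream))


# Hypothetical work item names, 10 per node.
cache_frames = ["cache.%04d" % f for f in range(1, 11)]
render_frames = ["render.%04d" % f for f in range(1, 11)]

print(len(cross_product_tasks(cache_frames, render_frames)))  # 100
print(len(map_by_index_tasks(cache_frames, render_frames)))   # 10
```

Because each pair in the second version depends on only one upstream item, frame 1 of the render has nothing to wait for once frame 1 of the cache exists, which is the chaser behaviour described above.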
mantra is another fetch top that points to a mantra rop.
waitforall does as implied: it won't let downstream nodes start until all the upstream nodes are completed. It also subtly adjusts the flow of tasks; the previous nodes have 48 dots representing the individual frames, while this node has a single rectangle, implying it's now treating the frame sequence as a single unit.
ffmpeg top needs some explaining (and some adjustments to the fetch top that calls the mantra rop), which I explain below.
Note that the frameranges on the fetch tops override the ranges set on their target rops by default.
Also note that the button with the orange thingy on it kicks off the output, looking for the matching node with the orange output flag. See in that screenshot how I've left it on the mantra node? That means it'll never run the ffmpeg task. I'm an idiot.
Short version: If things are acting weird, set the 'generate when' option to dynamic. You'll get a little purple icon to say it's now dynamic, and stuff should work.
Long version: Tops makes a distinction between 'generate' and 'cook' for nodes. The core idea being you can sometimes know stuff about your nodes before they do work, but other times you might wait for the nodes to finish. For example a tops node to run a mantra rop; it can look at the first and last frame parameters, and know exactly how many workitems to create without actually running the mantra job.
Compare that to, say, a node that runs an external remeshing node or some wacky database call; it might return 1 thing, it might return 200 things, it can't know until the node is actually executed.
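A toy Python sketch of that distinction, with no PDG calls in it. Both function names and the fake job are invented for this example; the point is only that one count can be worked out from parameters, while the other needs the work to have happened.

```python
def static_item_count(first_frame, last_frame, step=1):
    """Count work items from the frame range alone, without running anything."""
    return len(range(first_frame, last_frame + 1, step))


def dynamic_item_count(run_job):
    """The count is only known after the (hypothetical) job has actually run."""
    results = run_job()
    return len(results)


def fake_remesh_job():
    """Stand-in for an external process; returns however many pieces it made."""
    return ["piece_a", "piece_b", "piece_c"]


print(static_item_count(1, 48))             # 48
print(dynamic_item_count(fake_remesh_job))  # 3
```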
On top of that, there are subtle differences between what you see when inspecting your tops graph as a human, clicking on little dots and looking at this vs that, and what happens under the hood when a tops graph is executed. By the time you have little dots to click on, the graph has been generated and it's all been cooked. But while each node is actually in the process of cooking, things that you as a user can see afterwards might not be available to the node yet.
What the hell does all that mean?
It means you might create a tops variable, say @renderpass, which is made from looking at a filepath and splitting the last folder name off. But when you try to use that with an ffmpeg top, it fails. When you inspect the logs for the node, the output path looks malformed, because where @renderpass was used, it's being evaluated as an empty string.
This is because at the time the ffmpeg node runs, @renderpass doesn't exist yet, so it's a blank string. It's only after the complete tops graph is cooked that you, as a human, can see it exists, and you shout at ffmpeg in frustration.
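Here's a small Python sketch of that failure, outside of Houdini. The paths, the `/movies` folder and both function names are hypothetical; the dictionary just stands in for a workitem's attributes. It shows how a missing attribute quietly turns into a malformed path rather than an error.

```python
import posixpath


def renderpass_from_path(filepath):
    """Return the name of the last folder in a file path."""
    return posixpath.basename(posixpath.dirname(filepath))


def movie_path(attribs):
    """Build an output path; a missing attribute falls back to an empty string."""
    return "/movies/{}.mp4".format(attribs.get("renderpass", ""))


# Hypothetical rendered frame.
frame = "/shots/sh010/render/beauty/beauty.0001.exr"

# Attribute not created yet: the path comes out malformed.
print(movie_path({}))  # /movies/.mp4

# Attribute exists: the path looks as expected.
print(movie_path({"renderpass": renderpass_from_path(frame)}))  # /movies/beauty.mp4
```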
The 'generate when' at the top of every node is to help tops understand if it can generate its required attributes and workitems up front quickly, or if it has to wait for previous nodes to cook before it can do work. Here, if I set the generate when mode to 'each upstream item is cooked', then ffmpeg waits until the previous workitem has been fully cooked, the @renderpass attribute should definitely exist, meaning the output path should look as expected.
The default 'automatic' mode usually does the right thing, but if you get odd results, that's the first thing to try.
Force python scripts to run on the farm
If you have a python script node, even if you have a tractor or deadline scheduler, it will run in your local houdini session by default.
To fix this, turn off 'Evaluate in process'.
Ensure a rop geometry top sim runs on a single blade
You don't want 240 machines all doing their own run up, that's silly. Go to the 'rop fetch' tab, enable 'all frames in one batch', that'll lock it to a single blade and run sequentially.
Tractor scheduler stuck
This used to happen too often; it happens less often after some fixes from SideFX. Tricks to unstick it, in order of least to most annoying:
- Make sure there's no active stuck jobs of yours on the farm, delete 'em all and try again
- R.click on tractor scheduler, 'delete temp directory'
- Select the tractor scheduler, ctrl-x to cut it, ctrl-v to paste it
- Reload the hip
- Restart houdini
- Quit FX
- Quit the industry
Rez and tops debugging
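Run this in a python script top to see what's going on with rez and environment values:
import os
import pdg

print('debug info...')

# Show what rez resolved; .get() avoids a KeyError outside a rez environment.
key = 'REZ_RESOLVE'
print(key + '=' + os.environ.get(key, '<not set>'))
print('')

# Report the has_tractor flag from PDG's scheduler types.
has_tractor = str(pdg.types.schedulers.has_tractor)
print('pdg.types.schedulers.has_tractor: ' + has_tractor)
print('')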
Ffmpeg and non sidefx rops
I had a few issues getting ffmpeg to make mp4s from a renderman rop. In the end the fixes were relatively straightforward.
The ffmpeg top needs to know a few things:
- what part of the upstream nodes are making images, set with output file tag
- what the location of those images are, set with output parm name
- that the images are bundled into a single unit of work, using a waitforall top.
Top nodes can tag their output(s); in this case the ffmpeg top expects the images to have a 'file/image' tag. On the fetch top for the renderman rop, enable 'output file tag' and use the dropdown to select 'file/image'.
To know what file name to put in that tag, enable 'output parm name' and set it to 'ri_display_0'. This is the parameter on the ris rop where the image path is set.
To bundle all the frames into a single unit, use a waitforall top.
A last specific thing for our setup, our build of ffmpeg didn't understand the '-apply_trc' option, so I disabled it.
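If you end up building or patching an ffmpeg command yourself, dropping an option looks roughly like this in Python. The command line below is made up for illustration and is not the exact one the ffmpeg top builds; `strip_option` is a hypothetical helper that assumes the flag is followed by exactly one value, which is how '-apply_trc' is used.

```python
def strip_option(args, flag):
    """Return a copy of args without `flag` and the single value that follows it."""
    cleaned = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This is the value belonging to the removed flag.
            skip_next = False
            continue
        if arg == flag:
            skip_next = True
            continue
        cleaned.append(arg)
    return cleaned


# Hypothetical command line for turning an EXR sequence into an mp4.
command = [
    "ffmpeg", "-y",
    "-apply_trc", "iec61966_2_1",
    "-framerate", "24",
    "-i", "render.%04d.exr",
    "out.mp4",
]

print(strip_option(command, "-apply_trc"))
```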
Set a limit on the number of jobs
Say you have a folder full of images that you want to process, but for testing just want the first 5 images.
A filterbyrange top will let you do this.
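For comparison, the same idea in plain Python, handy if you'd rather trim the list in a python script before it becomes workitems. This is only an analogue of what the node does for you; the function name, the inclusive index range and the file names are assumptions for this sketch.

```python
def filter_by_index_range(items, start, end):
    """Keep the items whose index falls within start..end (inclusive)."""
    return [item for index, item in enumerate(items) if start <= index <= end]


# Hypothetical folder listing.
images = sorted("image.%04d.jpg" % n for n in range(1, 21))

# First 5 images only, for a quick test run.
print(filter_by_index_range(images, 0, 4))
```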
Tops with tractor mostly works. The long story can be found below, but here's the summary:
- Your environment needs access to the python tractor api. If you use rez, make sure to bring in a package for tractor.
- PDG assumes it'll find $PYTHON set correctly. We didn't have this, but even then I found I couldn't use the regular system python, but had to point it to hython ( $HFS/bin/hython )
- If your farm is behind a firewall, make sure your IT department chooses 2 ports you can use, and enter those ports into the callback and relay port fields on the tractor scheduler
- As of 18.0.502 retry support exists on the tractor scheduler, as well as options for better logging.
- Cooking jobs by default expect to connect to your desktop machine to update information, give you blinky lights and dots. This means that if you close your houdini session, the job will stop working on the farm. Call me old fashioned, but that defeats most of the point of using a farm. If you don't want this, use the 'submit graph as job' option at the top of the tractor scheduler, and it will run independent of your GUI session. Getting these to work reliably was problematic for us, YMMV.
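For the $PYTHON point in the summary, here's a minimal Python sketch of building that value from $HFS. Where you actually set it (rez package, wrapper script, scheduler environment) depends on your pipeline; the function names and the example install path are hypothetical, and it uses forward slashes on the assumption of a Linux farm.

```python
import posixpath


def hython_path(env):
    """Return $HFS/bin/hython, or None if HFS isn't set."""
    hfs = env.get("HFS")
    if not hfs:
        return None
    return posixpath.join(hfs, "bin", "hython")


def with_pdg_python(env):
    """Return a copy of env with PYTHON pointing at hython, if HFS is known."""
    new_env = dict(env)
    path = hython_path(env)
    if path:
        new_env["PYTHON"] = path
    return new_env


# Hypothetical install location.
print(with_pdg_python({"HFS": "/opt/hfs18.0"})["PYTHON"])  # /opt/hfs18.0/bin/hython
```

In a real session you'd pass in `os.environ` rather than a hand-written dictionary.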
Tops and tractor diary
Moving the diary to TopsTractorDiary.