Open Source Publishing Development

Hi,

A fellow pipeline developer and I have just started developing an open source publishing tool, primarily for Maya, but with the intent of being platform-, software- and language-agnostic.

The goal is to better understand the needs of publishing across feature film, games and commercial productions, and to explore the challenges, benefits and disadvantages of running a tool such as this external to any software, as opposed to utilising the internal libraries shipping with e.g. Maya, such as PySide.

I’ve started a community-driven Wiki on GitHub for the project, along with a rudimentary introduction to the topic and the architecture of the application we’ll be developing - the goal being to expand it as development progresses. The “Issues” section of GitHub will be used as a to-do list, to assign tasks and keep track of who does what, along with of course tracking issues once they start to arise.

Would there be any interest on this forum in either following or contributing to such a project?

Let us know here!

I’m guessing this is part of the Pipi project development?

This is standalone and should be directly applicable in various productions once complete, without any other dependencies. I’ll be looking to this project for guidance in implementing publishing for Pipi, as I’d like others to do too for their publishing efforts, but other than that there is no relation.

Sorry for the long post! :slight_smile: Not sure how far you’ll read given the length of it, but remember that even though I have some criticism, I really like the idea of creating tools like this open source and driven by the community.

The project sounds interesting but I think the boundaries for now seem rather undefined, or at least the exact use case.
This makes it hard for me to jump right in with ecstatic excitement.

As of now the description makes it float somewhere between checking for errors (selection, validation, extraction, conform) and asset management.
You try to define rules for where files come from and where they should go, which is the core of asset management, because this results in a form of file-link relationships.
On the other end you’re trying to define rules for how the selection, validation, extraction and conform procedure gets processed.

I think the core of what you’re trying (or want) to create would allow people to easily set up checks before publishing and maintain a consistent output of data.
For this I would assume there are two important, separate steps:

  1. Check the relevant assets for any errors (and possibly raise warnings/errors and fix them for the user)
  2. The actual publishing (the asset management part)

Since these are two very separate elements (at their core) I would likely separate them into independent libraries:

  • One would provide the asset management (file path relationship, what goes from where to where based on user configurations)
  • One would provide the processing of selections, validations, extractions and conforms to a given output.

Asset Management
There have been many discussions about asset management and there are tools out there already trying to help with it. (Think Shotgun, ftrack, Pipi?)
It would definitely be interesting to see a good open source solution that offers a solid foundation for asset management in a CG production!
I know that the awesome guys from the Blender Foundation are about to start up exactly this in an open-source way.
If this is what you’re aiming for I would recommend teaming up with them, because they will likely be able to get a lot of people involved with it!
Here’s some more information: http://gooseberry.blender.org/call-for-feedback-redcurrant-high-level-requirements/

I definitely got excited when I saw they were setting this up.

Selection, Validation, Extraction and Conforming
This other aspect is something that is so very studio/pipeline specific. I think many studios will be performing other checks on their scene!
Nevertheless this is a great aspect that could benefit from a light-weight core that allows users to process data, validate it and output it.

I developed the idea for a tool once (and prototyped it with an intern at the time) that we called Sir Check-a-lot.
Basically it processed a list of things to check for and output any errors it found (with possible auto-fixes for invalid cases).
It was pretty close to Aardman’s Quality GuAard in terms of core concept. (Look it up on this page: http://www.martintomoya.com/tools_info.html)

Basically we set up the core for how ‘checking/validation/fixing’ would work, which was totally software-agnostic.
Checks could depend on each other (like nested checks: if the first doesn’t pass, there’s no need to run the second).
In practice I could use the application stand-alone to check a website every 10 minutes to see if it’s still online (and if not, perform an auto-fix if possible), but we could also very easily set up a bunch of custom checks for Maya to be used before publishing a model.
All we needed to do was add ‘Check plug-ins’ that perform the checking functionality (like checking UVs and returning whether the issue was auto-fixable).
It’s an extremely interesting (and useful) prototype which I never got around to finishing into a solid tool in our pipeline, but I really should… because something like this could be invaluable for consistency in workflow and output.
Also, because it’s basically a processing queue with possible relationships between items, the final thing in the queue could easily be the publishing ‘Check’ that actually outputs the file, only if all checks were successful!
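
To make the idea a bit more concrete, here is a minimal sketch of such a queue of checks with dependencies and auto-fixes. The class and function names are made up for illustration and aren’t taken from the actual prototype:

```python
# Minimal, hypothetical sketch of a check queue with dependencies and auto-fixes.
class Check(object):
    """One check with an optional auto-fix; `requires` names checks that
    must have passed before this one is allowed to run."""
    requires = []

    def run(self):
        """Return True when the check passes."""
        raise NotImplementedError

    def fix(self):
        """Attempt an auto-fix; return True when something was repaired."""
        return False


def process(checks):
    """Run checks in order, skipping any whose dependencies failed."""
    passed = set()
    for check in checks:
        name = type(check).__name__
        if not all(dep in passed for dep in check.requires):
            print("Skipping %s (dependency failed)" % name)
            continue
        ok = check.run()
        if not ok and check.fix():  # try the auto-fix, then check again
            ok = check.run()
        if ok:
            passed.add(name)
        else:
            print("Failed: %s" % name)
    return passed
```

The last item in such a queue could then be the actual publish step, which only runs once everything before it has passed.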

The ‘check’ itself
The implementation of each “Check” will not be software and language agnostic, unless you provide a wrapper around all applications you want to support.
The primary issue with selection, validation and extraction is that they will always be very application-specific.
I assume you would want to check UVs, invalid NGons, naming conventions, etc. in many cases, which would really require very specific code to be written for each application.
You won’t be able to provide a wrapper around all useful use cases, so this shouldn’t be in the core of the tool.
Instead of focusing on delivering the validations, you would provide a solution that easily allows TDs to implement their own validations in the system.

Hey BigRoyNL,

Thanks for your post, some very valid points in there.

To summarise the below:

  • Asset Management = where and how files end up on a file-system/metadata ends up in a database
  • Publishing = quality assurance

I can see how a standalone tool such as this might seem to fit right in the middle of a bunch of other tools in a pipeline; it certainly does. About boundaries; you’re right about there being a blurry line between getting data out of software and asset management. I’ll have to try and properly define these, but for the time being I’d classify the separation like this: from the point where a user initiates a publish to where data actually leaves the software and hits a temporary spot on the local file-system - this is where the responsibility of Publish ends and out-of-band pipeline processes kick in.

Within this boundary, there are plenty of concerns to consider that have very little to do with asset management, such as the user interface, how to write validations and how to have them run upon publishing.

At the end of this boundary, Publish could potentially provide hooks for additional processes, such as logging into databases and actually placing the resulting output into a consistent spot on a file-server, but would ship defaults to satisfy the simplest of asset management needs.

There have been many discussions about asset management and there are tools out there already trying to help with it.

Agreed, and I’d like to think I’m in tune with what it means to manage assets within a production, but I think we’re getting ahead of ourselves. I don’t think publishing needs to be an involved process; although it certainly could be, and has been wherever I’ve seen it thus far. Asset management would certainly be difficult without publishing, but consider whether or not publishing would be equally difficult without asset management.

As for the responsibilities of Publish (this app), it should facilitate at least one method of achieving publishing at a studio. I’ve spent time at quite a few studios that had never heard of the concept of publishing before; I think this is quite a shame. It’s a tricky topic to isolate, as you pointed out, and thus to debate. But I think it could be very useful on its own in a whole lot of situations.

I’ll try and more clearly define the boundaries. Thanks for your input!

The ‘check’ itself
The implementation of each “Check” will not be software and language agnostic, unless you provide a wrapper around all applications you want to support.

Agreed, these would all have to be very specific to each software running Publish.

Edit: For clarity, “check” == “validation”

Asset Management
I know that the awesome guys from the Blender Foundation are about to start up exactly this in an open-source way.
If this is what you’re aiming for I would recommend teaming up with them, because they will likely be able to get a lot of people involved with it!

My understanding of their initiative is to mimic Shotgun/FTrack in its task-tracking and scheduling functionality, as opposed to Pipeline Toolkit/Connect; I’d have a look at Stalker for that. Task-tracking, however, doesn’t involve quality assurance.

Could you have a look here and let me know if anything is unclear?

Of the four stages, Selection, Validation, Extraction and Conform, it is Conform that breaks this boundary and touches on asset management. I tried removing this process; let me know if this makes sense:

Here, the responsibility of Publish ends once the data has been exported out of e.g. Maya and onto a publicly accessible location. A signal is emitted (a message, callback, etc.) that external pipeline tools can listen on; the signal carries an absolute path from which other tools can continue processing the data, such as copying it to a location for other publishes and so on.
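
As a hedged illustration of that hand-off (none of these names reflect an actual Publish API; they are made up for the example):

```python
# Hypothetical sketch: Publish stops at "data is on disk", then notifies listeners.
import shutil

_listeners = []

def register_listener(callback):
    """External pipeline tools register a callable taking an absolute path."""
    _listeners.append(callback)

def emit_published(path):
    """Called by Publish once an extraction has landed on the local file-system."""
    for callback in _listeners:
        callback(path)

# Example: an out-of-band tool copies the publish to its final destination.
register_listener(lambda path: shutil.copy(path, "/server/published/"))
```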

We’re thinking about how to make storing metadata configurable enough to fit others; if anyone has any objections or ideas, now would be an excellent time to chip in.

Has anyone had any experience with shot publishing? I take it the expectations of it are quite different from publishing assets in general. What steps would you expect from preparing a shot for publishing? This is how I’m envisioning it:

  1. Artist imports a bunch of assets
  2. Artist hits “publish”
  3. Each asset’s corresponding output is exported; e.g. alembic for deforming geo, atom for curves and json for a description per constraint used.

How do you inform your publishing tool of this output?

If I understand the goals of this project correctly, you are aiming for a software-agnostic validation tool that can sit between the DCC and any tracking software?

That would be one way of using it, yes. It should also be useful on its own, for those who aren’t using tracking software. (Assuming you’re referring to Shotgun et al.)

The tricky thing with Publishing is that you create a form of output that should be easily read and magically worked with somewhere else.
“Publish” can create a multitude of output files, but if there’s no way to consistently create the next work-file it doesn’t reduce that much more hassle; just more neatly saved files.

Consider this Animation -> Lighting workflow in Maya:

We have an animated scene with a bunch of rigged/animated characters, tons of V-ray proxies (that we had to constrain to an animated surface, something the character is pushing around) and some other hi-poly meshes.
Then we also have a particle system with some instanced objects using Maya’s instancer node.

As an output I would need to have at least the following assets to create the lighting file:
  1. Alembic (containing animated meshes + hi-poly meshes; EXCLUDE the shapes of the vrayProxies!)
  2. Shader relations (Alembic publishes without shaders for an asset, but it should contain an ‘asset ID’ so in the lighting file we can identify what source asset it was and apply the shader.)
  3. Baked (possibly animated) V-ray proxies. These can’t be included in an Alembic, and again they should carry some form of ‘asset IDs’ to apply shaders to at a later stage.
  4. Instances
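
To sketch what I mean by item 2 (purely illustrative; the IDs, names and path are made up), the shader relations could be as simple as a JSON file keyed by asset ID:

```python
# Hypothetical shader-relations export, keyed by asset ID rather than node names.
import json

shader_relations = {
    "a1b2c3d4": {"shader": "hero_skin_SHD", "members": ["body_GEO", "head_GEO"]},
    "e5f6a7b8": {"shader": "hero_cloth_SHD", "members": ["jacket_GEO"]},
}

with open("/temp/hero_shaderRelations.json", "w") as f:
    json.dump(shader_relations, f, indent=4)
```

The lighting scene can then look up the same asset IDs and re-assign shaders without relying on names or namespaces.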

All 4 of those have individual requirements (and inputs in the scene could possibly be mixed).
The problem with publishing is mostly that new techniques often require new ways of publishing that might fall outside of the predefined workflow.
For example, if we were to do fur simulations with Shave and a Haircut or Yeti, we would end up with different caches that we need to propagate from an FX department into the final lighting scene.
The same goes for liquid sims, or transferring stuff between Houdini and Maya. Maybe somebody wants to do some Marvelous Designer clothing.
And suddenly you see that there’s a never-ending, fluid stream of changes to the steps that are taken within a CG production.
I have tried to keep up with the number of asset-publishing requirements projects kept adding, but every time we fell back to workarounds because of tight deadlines.

I think the best thing Publish could be is a framework for any basic artist to predefine STEPS/ACTIONS to be taken at certain stages (input-output).
And the ease of creating customized ACTIONS should be of utmost importance, because new ways of publishing or requirements will always come up.
Publish therefore should be able to handle both exports and imports of data to be of significant use.

What I would like to have is a tool that I could use as basis for setting up presets/templates for publishing per pipeline, per project, per workspace (and possibly per user?).
Let’s say I have written my own “Fur simulation export” and “Fur simulation import” as a ‘PublishAction’ (a class/plug-in?) that sticks to some of the core principles of “Publish”:

  • Filter
  • Validate
  • Conform
  • Export/Import (Process the data to be used)

And if I could stick that ‘ACTION’ anywhere in the pipeline as a process, that would be great! :slight_smile:
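
As a rough, purely hypothetical sketch of what such a ‘PublishAction’ plug-in could look like (none of this is an existing API):

```python
# Hypothetical base class following the four principles listed above.
class PublishAction(object):
    def filter(self, scene):
        """Return the subset of nodes this action operates on."""
        return scene

    def validate(self, nodes):
        """Return a list of (node, error) tuples; empty means valid."""
        return []

    def conform(self, nodes, errors):
        """Attempt to auto-fix reported errors; return remaining errors."""
        return errors

    def process(self, nodes):
        """Export (or import) the data once validation has passed."""
        raise NotImplementedError


class FurSimulationExport(PublishAction):
    """The 'Fur simulation export' example; the node suffix is made up."""
    def filter(self, scene):
        return [node for node in scene if node.endswith("_furSystem")]

    def process(self, nodes):
        print("Exporting fur caches for: %s" % ", ".join(nodes))
```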

Filter
We need to filter what nodes we actually want to include in the export.
(This could be through selectionSets, attributes, a basic set, namespaces, anything really)
If adding custom filters is really easy, each studio can quickly cover its own ‘custom’ cases.
Think drag-and-drop choosing to filter by attributes, objectSets, or a combination of those.

Validate
Similar to ‘Filter’ this must be really easy to customize, add and tweak on a per scene basis.
I separated ‘Validate’ from ‘Filter’ just because we would rather get a list of all ‘required’ nodes for publishing before we check whether they are valid.
This means that validation processes a filtered list of nodes (or a subset of those nodes).
Validations could be:

  • Check normals
  • Make sure all assets have “asset IDs”
  • Make sure all nurbsCurves (thus a subset of all nodes) have a certain attribute or length.

Conform
We can only conform data if we know what is possibly wrong with it.
I think conforming would be “to be able to auto-fix when a validation finds issues (possibly with user-prompts to confirm)”.
Again, this is such a per-scenario thing that there is no ONE solution that fits all; allowing plug ‘n’ play (quick and dirty) customizations is key.

Export/Import
The most important automation here is the input/output path templating: configuring where files end up or where they can be found.
I’d recommend using an existing templating scheme (Jinja2?) as a basis and then expand on that.
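
A small illustration of that idea using Jinja2 (the template keys and path layout are made up; the point is that the layout lives in configuration, not in code):

```python
# Hypothetical publish-path template rendered with Jinja2.
from jinja2 import Template

template = Template(
    "{{ project }}/publish/{{ asset }}/{{ family }}/"
    "v{{ '%03d'|format(version) }}/{{ asset }}.{{ ext }}")

path = template.render(project="/projects/alpha",
                       asset="hero",
                       family="model",
                       version=7,
                       ext="abc")
# -> "/projects/alpha/publish/hero/model/v007/hero.abc"
```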


Not sure if the above summation helps or only drifts away from what you want Publish to be.
But what I mention here is what I could possibly see becoming useful.

I think that’s a very valid point from BigRoyNL:

The tricky thing with Publishing is that you create a form of output that should be easily read and magically worked with somewhere else.
“Publish” can create a multitude of output files, but if there’s no way to consistently create the next work-file it doesn’t reduce that much more hassle; just more neatly saved files.

What you describe here is something I would classify as “shot publishing”, the opposite being “asset publishing”. Shot publishing does indeed seem like a much larger ocean of individuality amongst studios, and I think for the time being our efforts will be focused on publishing assets, which involves a single output into a single archive - typically a single file, such as .mb for character rigs, .tga for textures etc. - but possibly multiple ones - e.g. .obj, .abc and .fbx for a prop model.

Shot publishing, as it seems, is then the publishing of the output of multiple assets at once. Each asset then is treated as a generator of additional output - e.g. a character rig generates point-caches.

I’ll save my comments on shot publishing for a later time in an effort to keep the conversation focused. First, to join our jargon:

Filter

What you refer to as “filter” is what we’re referring to as “selection”; it is the method by which a scene is reduced to the objects to be published.

Export/Import

What you’re referring to as export is what we’re referring to as conform, and it is outside the scope of Publish. In a nutshell, all published files would end up relative to the current working file.

See here Published file structure concept · Issue #9 · pyblish/pyblish-base · GitHub
And here Home · pyblish/pyblish-base Wiki · GitHub

Conform

“Conform” probably isn’t the best name for how we use it, but it’s meant as “take the finished output, and put it where it belongs”.

I think the best thing Publish could be is a framework

I think so too. I was thinking of providing this at least for validations - e.g. a plug-in system with, say, one file per validation, and a naming convention to correlate classes/families of output (model, rig, animation etc.) with their validations. Similar to how the Python nose package looks for tests.
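
As a hedged sketch of what that nose-style discovery could look like (the filenames, directory layout and function are made up for illustration):

```python
# Hypothetical discovery: one file per validation, the filename encoding the family,
# e.g. validate_model_normals.py or validate_rig_naming.py.
import os
import glob
import importlib.util

def discover_validations(directory, family):
    """Load every validate_<family>_*.py module found in `directory`."""
    pattern = os.path.join(directory, "validate_%s_*.py" % family)
    modules = []
    for path in glob.glob(pattern):
        name = os.path.splitext(os.path.basename(path))[0]
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        modules.append(module)
    return modules
```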

But it makes sense to make more parts of Publish susceptible to extension. As an example, one of the things we’ve been discussing is how to identify what is to be published, and what is not. I’ve advocated selection sets, while others have spoken of attaching attributes onto nodes. This could potentially also be treated as a separate, exchangeable component that could either be written by others or provided by us along with a few alternatives.

Initially though, I’d like to stress that getting a minimum viable product out is of primary concern, and plug-ins and templates will most likely take a back-seat for that.

for any basic artist to predefine STEPS/ACTIONS to be taken at certain stages (input-output)
And the ease of creating customized ACTIONS should be of utmost importance, because new ways of publishing or requirements will always come up.
Publish therefore should be able to handle both exports and imports of data to be of significant use.

This is a fascinating idea; I’d love for you to try and illustrate it more thoroughly if you can. For example, as a use case illustrating a suggested step-by-step workflow, and perhaps a mock-up GUI of how an artist would actually create/define these steps.

Not sure if the above summation helps or only drifts away from what you want Publish to be.
But what I mention here is what I could possibly see becoming useful.

It certainly does! At the very least it highlights some aspects that have not yet been aired, and it’s good to at least be familiar with questions that are bound to hit us down the road at some point, like shot publishing.

The same customization requirements hold true for ‘asset’ publishing over ‘shot’ publishing.

We would check for:

  • naming conventions,
  • nodes from plug-ins (give possible warning; ask to clean-up if required),
  • normals (a variety of checks)
  • UVs

In some projects we might also need ‘multiple UV sets’-checks because a certain step in the project can’t handle multiple UV sets (I think Alembic is actually one of them at this point.).
But in others we might not. So the validation process should be easily ‘configurable’ and should allow for creating (sharable) presets. Of course each validation should be as small as possible to gain modularity.

Note that many output types have very customizable import options (even .obj and .fbx can really mess up on import with the wrong settings) which could benefit from similar automation and workflows as described by Publish.
Since Publish already defines the ‘configurations’, I think its core could even work for importing assets. I agree that this blurs the boundaries between publish and asset management even further; yet the task of importing the exported files in the correct manner should not be underestimated. (Without automation it’s almost as prone to human error as exporting/publishing).

Upon publishing an asset (model) we also need to add ‘asset IDs’ to our objects, which is our pipeline’s tracking system for Maya objects (within the scene).
This way it’s not related to names and/or namespaces, because those always get f*cked up at some point.
And we use this to publish ‘shader-relations’ from our separate lookDev (shading) workspace towards the final lighting files.
Basically merging what has been set-up in ‘lookDev’ to the published ‘animation’ file as a ‘lighting’ file.
This ‘add asset IDs’ must be easily added into “Publish” as a validation/conform step.
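
For illustration, the ‘add asset IDs’ step could be as small as this (the attribute name and the use of UUIDs are just an example of one way to do it):

```python
# Hypothetical conform step: store a persistent ID in a custom string attribute.
import uuid
from maya import cmds

def ensure_asset_id(node, attr="assetId"):
    """Add a persistent ID to `node` if it doesn't have one yet; return it."""
    plug = "%s.%s" % (node, attr)
    if not cmds.attributeQuery(attr, node=node, exists=True):
        cmds.addAttr(node, longName=attr, dataType="string")
        cmds.setAttr(plug, str(uuid.uuid4()), type="string")
    return cmds.getAttr(plug)
```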

For a simple model publish this could be the workflow:
Selection
Let’s say a user is able to select a ‘SELECTION’ filtering method (or a combination of them) and uses the following SELECTION FILTER STACK:

  • ADD NODES by Type: Shapes
  • FILTER NODES by Type: nurbsCurves
  • ADD NODES by Attribute + Value: “.forcePublish” == True

At the end of Selection we have all shape nodes that are not nurbsCurves + all nodes that have a forcePublish boolean attribute set to True.
We save this as a Selection Preset so we can use it in multiple places (and only have to update this single preset to update all publish configurations that use it).
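
Translated into Maya commands, that stack could look roughly like this (a hedged sketch; only the .forcePublish attribute comes from the example above):

```python
# Hypothetical evaluation of the SELECTION FILTER STACK described above.
from maya import cmds

selection = set(cmds.ls(shapes=True) or [])          # ADD NODES by type: Shapes
selection -= set(cmds.ls(type="nurbsCurve") or [])   # FILTER NODES by type: nurbsCurves

for node in cmds.ls() or []:                          # ADD NODES by attribute == True
    if cmds.attributeQuery("forcePublish", node=node, exists=True):
        if cmds.getAttr(node + ".forcePublish"):
            selection.add(node)
```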

Validation
For this model we need to check a multitude of things:

  • Names: Naming conventions
  • Names: Non-unique names
  • Mesh: Non-manifold
  • Mesh: Zero length edges
  • Mesh: Zero area faces (note: Maya’s built-in functionality is rather slow here on high-res meshes!)
  • UV: Space 0.0 to 1.0 (may not exceed single UV space)
  • UV: Single UV Set
  • Other: Objects must have asset IDs (create if needed)
  • Clean-up: Ensure no animation exists (likely unwanted for published models)
  • Clean-up: Remove from display layers
  • Clean-up: Remove from render layers (if any)
  • Clean-up: Remove all shaders & assign (default) lambert1
  • Warning: Raise warning if external plug-ins are used.
    (Note that this is just a possible small subset of the entire list of validations.)

By separating all validations and fixes we should be able to easily make a new preset and enable/disable or add/remove/edit if a project requires.
This also allows us to easily backtrace to a single validation implementation if any errors are found.
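
As an illustration, such a preset could be little more than a mapping of validation names to on/off flags (the names mirror the list above but are otherwise made up):

```python
# Hypothetical per-project validation preset.
MODEL_PRESET = {
    "validate_naming_convention": True,
    "validate_unique_names": True,
    "validate_non_manifold": True,
    "validate_zero_length_edges": True,
    "validate_uv_range": True,        # UVs must stay within 0.0 - 1.0
    "validate_single_uv_set": False,  # disabled on projects that allow multiple sets
    "validate_asset_ids": True,
}

def run_preset(preset, validations):
    """Run only the enabled validations; `validations` maps name -> callable."""
    return {name: validations[name]() for name, enabled in preset.items() if enabled}
```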

Extract/Conform (Export):
Save the resulting mesh out to both a .mb (Maya Binary) file and an .abc (Alembic) file at a user-given path. (Possibly automated with configurations)
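
A minimal sketch of that dual extraction, assuming the AbcExport plug-in is loaded and using made-up paths and node names:

```python
# Hypothetical extraction of the validated nodes to .mb and .abc.
from maya import cmds

nodes = ["|hero_GRP"]  # stand-in for the result of the Selection/Validation steps

cmds.select(nodes, replace=True)
cmds.file("/temp/hero_model_v001.mb", force=True,
          exportSelected=True, type="mayaBinary")

roots = " ".join("-root %s" % node for node in nodes)
cmds.AbcExport(j="%s -uvWrite -file /temp/hero_model_v001.abc" % roots)
```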

Just want to let you guys know I am loving this thread!!!
Keep up the very interesting project and thanks for sharing :slight_smile:

I agree that this blurs the boundaries between publish and asset management even further; yet the task of importing the exported files in the correct manner should not be underestimated. (Without automation it’s almost as prone to human error as exporting/publishing).

That is a very good point, and one I had not considered.

It makes me contemplate whether it may be an idea to pivot Publish into either a pure validator - in that it only performs checks on data within a host - or to also handle imports, which would essentially make it into an asset library/scene-builder. I see now why, in your original post, you felt it difficult to scope where Publish could fit in.

Pivoting

  1. As a pure validator, it would ultimately be up to the studio/user to validate data on its way out, which somewhat defeats the purpose of quality assurance. That’s one reason I felt it necessary to include the export part of publishing in Publish.

  2. To include imports/library features into Publish might possibly dilute its value as we’d have to satisfy a much larger portion of users.

  3. An alternative might be to provide three separate parts: one for imports, one for validations and one for exports, and allow end-users to use only what they need. The simplicity of the idea would somewhat vanish in doing so, however.

The same customization requirements hold true for ‘asset’ publishing over ‘shot’ publishing.

I’m glad you said that, but I honestly can’t wrap my head around it. Publishing a model involves checking every node in a scene for faults; that’s fine. Publishing an animation, however, involves checking a rig for faults, and to me that sounds highly dependent on the rig itself. Perhaps there isn’t a way to include publishing of point-caches without enforcing conventions on the rigs themselves; what are your thoughts on this?

I’ll digest some of this before answering the rest of your post!

Just want to let you guys know I am loving this thread!!!
Keep up the very interesting project and thanks for sharing :slight_smile:

Hey, thanks for the shout-out and glad you like it. :slight_smile: We’ve always got room for one more if you’re interested!