- Posts tagged visualization
- Explore visualization on posterous
Adaptive Mesh Refinement Volume Rendering with Python and Cython
Over the last six months or so, I've been working on implementing a software volume renderer for adaptive mesh refinement data -- specifically, for 3D visualization of astrophysical data. This has been implemented in the yt toolkit and it's been developed in the open. I'll give a bit of background, and how various other people have had their hands in this. There are two things that make me really happy about this new development: the first is that I have an actual use case for it, and I've been using it quite a bit for that this week (rendering my data!) and the other is that a number of people in the development community have had their hands in it, from start to finish.
The volume renderer we ended up with isn't the best -- it still has artifacts, and I can definitely see where it could use improvement! -- and it's not the fastest, but it's something that several of us worked on and it's something we can use to make viz of our data. I think it's probably better thought of as a way of preparing a visualization, not a way of engaging in visualization.
Background
I conduct cosmological star formation simulations. To study the formation of stars on cosmological scales, we use a technique called adaptive mesh refinement, wherein regions of interest are calculated at higher resolution than comparatively boring regions. While this enables us to do awesome things like zoom in by a factor of over 10^12 in length scales, it also means handling the data is more difficult. A couple years ago, my thesis advisor suggested I start work on an analysis toolkit written in Python. So, I did. It handles objects as described by physical regions (as opposed to code regions) and it abstracts away things like addressing of objects, data IO, generating fields, and so on and so forth. This way, we could address things as "slice here through the domain" or "get me a sphere" and the toolkit would go on its merry way and take all the necessary steps to get the data and then get out of the way.
Over time, new features were built on top -- the viz capabilities mostly stayed where they had been since 2006, with a few additions but almost all were reductions to two-dimensions. The only 3D visualizations we had were based on VTK, utilizing their HierarchicalBoxDataSet object. The image below shows the yt "next-generation" GUI applying marching cubes to an AMR dataset, using the VTK operators and data structures. There's a corresponding user interface, but it's still rough around the edges so I haven't included it here.
Late last summer, a colleague (and friend) of mine, John, implemented a mechanism that took the oblique-slice data object I wrote and made it into a "fixed resolution" oblique-slice. By stacking up a bunch of these fixed resolution oblique slices, the equations of radiative transfer could be solved and a "volume rendering" could be obtained from it.
Casting A Couple Rays
John suggested that it could go a bit further, and so I started looking at actually performing direct ray casting through the simulation. This is where a ray is thrown, and then it accumulates a value -- typically this is via a transfer function, where at every point the ray picks up emission and then via absorption it is diminished. I'd written a single-ray caster back in 2008, so extending that to serve the new multiple-ray purpose wasn't terribly difficult. However, what turned out to be the biggest difficulty was actually in the process of traversing grids. In the adaptive mesh refinement implementation we utilize, some regions are covered by multiple grids -- "child masks" are applied to coarse grids to indicate where finer (better) data is available. For the longest time, I had this mental block in my head that what I'd have to do is figure out a way to have a ray enter and exit a grid while traversing that very grid; I couldn't wrap my head around how to avoid this problem, or even how to handle it if I decided not to. All I could see was that we had a bunch of rays. These rays had to pass through every cell they intersected with, and they had to do it in order.
At my old institution, the wickedly smart Ralf K was the visualizer on-hand. Besides being a code demon, he also wrote his thesis on the visualization of adaptive mesh refinement data. During a conversation with him about this, he explained that typically what's done is that the grids are broken up into "bricks" -- non-overlapping, finest-data objects that fully tile the domain to be rendered. (Ralf had also taken me under his wing and shown me techniques for visualization last Spring, before I graduated. We prepared some visualizations of a simulation, and without exception his blew mine out of the water, even using the same software!)
The first attempt at a grid homogenization algorithm that I came up with was terrible. It generated 64 bricks for every grid (It is not coincidental that this is (2^3)^2, I think) and it was slow. Plus, with that many small grids, the advantages of viewing the cells as collections instead of individuals were basically nil. This is where the project languished for a couple months. We could make okay renderings, not great. We would do the stupidest thing and linearly interpolate at N points along the ray between intersection points of cells -- even if it was only a glancing blow, the ray would interpolate N times between enter & exit. This particular behavior hasn't changed, but it might some day. There were artifacts, they were slow, and nobody really cared about making volume renderings -- for those reasons!
Homogenization
Over Christmas break, I was waiting for simulations to run, it was snowing, and I needed a fun project to spend some time on. There are some papers that discuss it, but I am embarrassed to admit that most of them took it as written that the process of homogenizing adaptive grids is a solved problem. I forget when it was, but I remember finally got over my inhibitions: I have had an aversion for the last couple years to writing code that I know is ugly, or that I know is inefficient, when I feel that things could be done better. But, after I'd had enough, I just decided to let it go and I pushed forth.
I'm still not sure it's the right method. But what I ended up doing was writing classes for GridFaces and ProtoPrisms. The protoprisms sweep over directions until they encounter a grid face with which they intersect, and then they sweep in the next direction until the same. Once a protoprism without any intersecting grid faces is identified, that prism becomes a "brick." (The code, in yt/_amr_utils/VolumeIntegrator.pyx, is probably clearer than I am being right now.)
The final implementation I came up with only generates 2-8x as many bricks as grids. It's sub-optimal, as it doesn't perform any minimum bounding box calculations, but it met my standards for a "success"!
Transfer Functions, Color Maps and Artifacts
So now we had a ray caster that worked. All it needed was data, a transfer function to describe how to update the red, green, blue and alpha channels, and the will to push "go." Also, the resultant images had artifacts, which was bad, but staring at the source code I couldn't see the sources. That aside, however, making a transfer function was simply not easy, and I wasn't good at it. So, using Enthought's TraitsUI, I sketched up a quick transfer function widget. You can add new Gaussians, adjust their properties, and it calls a callback routine whenever "Run Routine" is clicked. I was pretty happy with it, and it let me make some interesting images. Seen here is a turbulent present-day star forming region, courtesy of my officemate David.
The problem with this, however, is that I'm not really very good at generating color maps or schemes! Fortunately, Jeff wanted to make some visualizations of data he was working with. He came up with the idea of taking existing colormaps, in particular those provided by matplotlib, and using those as input to the transfer function. The widths and the alphas were free parameters, but using the RGB combinations was a great idea. Sam ran with this a bit, and I added some functions to add layers, and it all came together.
Sam, though, gets a substantial amount of credit for taking a look at the artifact problem. He realized that rays were 'bunching up' at the edges of grids, so any isocontour (really, any non-zero emissivity, but we mainly use gaussians for isocontours) would show up grid lines that it crossed, very strongly. The mistake was not only mine, but it was a mistake I came very close to identifying a few weeks before Sam fixed it. (Note that I said: came very close to identifying, not fixing!) With this fix in place, and Jeff's colormap ideas, we had a working volume renderer that made images we could all be pretty proud of.
This image, made by Devin, is one that I think really shows off what makes me so proud of this volume renderer. First, he defined a field that required extra cells to create a stencil; yt then fills the necessary cells from neighboring grids. His function then creates a field. (In this case, enstrophy.) This field is then interpolated to vertex-centered data, homogenized and passed to the ray caster, where he then dumps back to disk and makes a movie. I think it kind of looks like a Sentinel, from the Matrix.
Optimizations
The ray caster itself, along with the grid homogenization is written in Cython. yt is a python package, but it includes hand-tuned HDF5 readers in C, and a number of the compute-intense routines are either in C, Cython or Fortran. The usage of Cython for the ray casting has been an amazing boon. "cython -a" is outstanding, as it drops out an html file with highlighted hotspots.
The zeroth pass of optimization was to try to make the algorithm as good as possible! The first pass of optimization was to get rid of all of the yellow lines. This wasn't terribly hard -- we mostly use pointers and C data types. The final pass, though, was to utilize the Shark utility on OSX. This is a free tool, provided by Apple, that can profile C/Fortran/whatever code. Cython helpfully comments all the generated C code with the original line numbers, and it's relatively easy to figure out the correspondence between the Cython code and hotspots in the C code.
It turns out that the second order optimization actually revealed things I missed in the zeroth order optimization -- I was doing stupid and unnecessary things to interpolate values inside a cell. After some tweaking, touchups and so on, the volume renderer actually sped up for some people by between 2x and 9x! Much of this was due to culling cells that weren't inside the transfer function boundaries, removing calls to expensive functions like "fmod" and so on. By multiplying by pre-calculated numbers, rather than dividing on-the-fly, we squeezed out maybe 10% or so.
At this point, I was happy with where the volume renderer was, and I was ready to start making movies and images. So I did.
This is just a still frame from one of my simulations; I have deliberately placed the isocontours here. In the gallery (linked below) I have a zoom-in movie from a different simulation where the contours have been placed at evenly-spaced intervals.
Images and Movies
Things have progressed a long way! Sam, Devin, Britton and I have all made some movies we can be proud of. Sam has set up a gallery of his own and we've got an official yt vimeo channel and a gallery from that, too.
Next Steps
The very next step is going to be allowing the alpha value depend on a different field than the RGB values depend on. John has suggested we give this a go to try to generate simulated images; I know this is Jeff's biggest priority, too. I believe that this addition should be relatively simply, with a great deal of scientific payoff.
For now, though, I'm really, really enjoying using this on my own data. That's why I wanted it in the first place -- to understand my data better, to reveal things about my simulations, to inform my intuition and assist my analysis.
There's a great body of research on the subject of volumetric visualization of adaptive mesh refinement data, by people that have thought about this a lot harder and in more depth than our group has -- and their results definitely underscore that! Be sure to check out the amazing renderings available, and remember that the results above are still, one-off images -- not interactive explorations. Those qualifiers aside, I'm still really proud of what we can make with this volume renderer, and I'm even more proud that it's all been a group effort, with the code all open source and available.
This writeup hasn't ended up being quite as technical as I originally thought, but I hope it's been enjoyable for you. Be sure to check out the links to the galleries -- there've been some really great movies and images, showcasing some really fun scientific research. Please also feel free to leave comments or questions -- and thanks for making it this far.




