Hi!
I’m a developer of Awkward Array, a Python package for manipulating large, irregular datasets. It handles JSON-like data (variable-length lists, nested records, missing values, mixed data types) with a NumPy-like interface (slices, implicit-loop functions, reducers, etc.) and NumPy-like performance, since it operates on contiguous numerical buffers.
The project was originally developed for LHC data analysis, but the concept is generic and we’re looking for use-cases in other fields. I know that sky images are rectangular, but there might be analysis steps at a later stage of processing that require large arrays with variable-length lists or other data structures, such as lists of candidate objects and their associations.
We’re also trying to fully integrate Awkward Array into the scientific Python ecosystem. It already interoperates well with pyarrow (Apache Arrow) and Numba, and we’re looking at Dask, Zarr, RAPIDS (cuDF), and maybe Xarray. Scientific use-cases are the primary drivers of integrations like these, which is why we’d like to hear about yours.
In addition, we’re working on a GPU backend, so that the same operations that work on arrays of irregular data in CPU memory work transparently on GPUs (replacing sequential algorithms with parallel ones under the hood). If you have an application that uses GPUs or might use GPUs in the future, we’d be interested to hear about that, too.
(Apologies if you’re seeing this twice because you’re on the Astropy mailing list!)
Thanks!
– Jim