See the bottom of this post for a glossary that may help Gen3 non-experts understand it.
The problems:
Right now the Gen3 butler has two high-level methods, pruneCollection and pruneDatasets, which try to cover all operations that look or smell like dataset deletion, including all of those handled by the also-public remove, removeCollection, and disassociate methods of Registry.  The Butler methods (and their command-line counterparts) have been tough to maintain, test, and use, and I think it’s fundamentally because they try to do too much: we’ve ended up trying to support many operations that no one may ever use, just because they’re the logical combination of various options/arguments we need for other reasons (e.g. unstoring the datasets in a TAGGED collection).
I also think many of those operations need to be removed now to avoid complicating our ownership model in future data repositories with a real concept of user or group ownership of datasets; if one can modify a dataset via a reference to it from some non-RUN collection, we’ll need many more kinds of ACLs.
In addition, right now we have one particularly important pain point, captured on DM-28857: it’s currently hard to delete the collection structure produced by pipetask (and as of DM-28960, BPS), which involves a CHAINED collection that references both output RUN collections and input collections of many types.  One can’t delete the RUN collections first, because that trips a foreign key violation as long as they are referenced by the CHAINED collection, and if one deletes the CHAINED collection first, the easiest way to find those RUN collections also goes away (but note that one doesn’t want to delete the input collections, and butler has no way to tell the difference using the CHAINED collection, so it’s not that easy).
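The ordering problem can be reproduced in miniature. The sketch below uses an in-memory SQLite database with a hypothetical two-table schema (`collection` and `collection_chain`) and made-up collection names; it is not the real Registry schema, only an illustration of why a foreign key from the chain table blocks deleting a child first.

```python
import sqlite3

# Hypothetical, simplified schema for illustration; NOT the real Registry tables.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE collection (
        name TEXT PRIMARY KEY,
        type TEXT NOT NULL
    );
    CREATE TABLE collection_chain (
        parent TEXT NOT NULL REFERENCES collection (name),
        position INTEGER NOT NULL,
        child TEXT NOT NULL REFERENCES collection (name),
        PRIMARY KEY (parent, position)
    );
    INSERT INTO collection VALUES ('u/user/out', 'CHAINED');
    INSERT INTO collection VALUES ('u/user/out/run1', 'RUN');
    INSERT INTO collection VALUES ('inputs/raw', 'RUN');
    INSERT INTO collection_chain VALUES ('u/user/out', 0, 'u/user/out/run1');
    INSERT INTO collection_chain VALUES ('u/user/out', 1, 'inputs/raw');
    """
)


def try_delete(name):
    """Try to delete one collection row; return False if a foreign key blocks it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM collection WHERE name = ?", (name,))
        return True
    except sqlite3.IntegrityError:
        return False


def remove_chain(name):
    """Delete a CHAINED collection together with its membership rows."""
    with conn:
        conn.execute("DELETE FROM collection_chain WHERE parent = ?", (name,))
        conn.execute("DELETE FROM collection WHERE name = ?", (name,))


first_attempt = try_delete("u/user/out/run1")   # blocked: the chain still references it
remove_chain("u/user/out")                      # also discards the list of children
second_attempt = try_delete("u/user/out/run1")  # allowed now
remaining = [row[0] for row in conn.execute("SELECT name FROM collection ORDER BY name")]
print(first_attempt, second_attempt, remaining)
```

Once the chain is gone, nothing in this toy database records that `u/user/out/run1` was an output while `inputs/raw` was an input, which is the second half of the problem described above.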
Finally, these methods are designed to encourage only unstoring datasets (while leaving their Registry description), to preserve provenance, but this is premature and annoying to users: they want to fully delete things, because there isn’t actually any provenance to preserve, and I think we need to provide a better way to “hide” collections before we make it too hard to fully delete them.  That seems doable via an extra flag column in the collections table, but only with a schema change.  Since adding provenance also will require a schema change, we can do those at the same time (later).
The near-term proposal:
- We add a new method, Butler.removeRuns, which fully removes one or more RUN-type collections and all of the datasets within them (I’ve started this on DM-29106).
- We remove the Butler.pruneCollection method, leaving Butler.removeRuns as the recommended way to delete RUN-type collections and Registry.removeCollection as the way to remove all other kinds of collections (which would no longer involve any kind of dataset deletion, because the references to datasets from those collections don’t imply any ownership that should allow one to do that).
- We also remove the Butler.pruneDatasets method, leaving us with no high-level way (for now) to fully delete individual datasets. I don’t think we have a use case for this right now, and I’d like to give us a chance to think about the future ownership model and actual use cases before reintroducing something like it (and I expect it will be replaced by multiple simpler methods for different kinds of deletion, as I am proposing we do now for collections).
- We change the deletion logic for collections to allow child collections to be deleted while they are referenced by CHAINED collections, by first replacing them there with a special sentinel “[deleted]” collection, which can be used as a way to notify the owner of the CHAINED collection that this occurred.
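To make the proposed semantics concrete, here is a minimal in-memory sketch. `ToyRegistry` and its method names are hypothetical stand-ins, not the daf_butler API; only the “[deleted]” sentinel name and the split between removing RUN collections (datasets go too) and removing other collections (datasets untouched) come from the proposal above.

```python
from dataclasses import dataclass, field

DELETED = "[deleted]"  # sentinel name from the proposal; everything else here is hypothetical


@dataclass
class ToyRegistry:
    """In-memory stand-in for a registry; not the daf_butler API."""

    types: dict = field(default_factory=dict)     # collection name -> "RUN" / "TAGGED" / "CHAINED"
    chains: dict = field(default_factory=dict)    # CHAINED name -> ordered list of child names
    datasets: dict = field(default_factory=dict)  # RUN name -> set of dataset ids

    def _replace_in_chains(self, name):
        # Swap the child for the sentinel so no chain points at a missing collection.
        for children in self.chains.values():
            children[:] = [DELETED if child == name else child for child in children]

    def remove_runs(self, names):
        """Fully remove RUN collections and their datasets, even if they are chained."""
        names = list(dict.fromkeys(names))  # drop duplicates, keep order
        for name in names:
            if self.types.get(name) != "RUN":
                raise ValueError(f"{name!r} is not a RUN collection")
        for name in names:
            self._replace_in_chains(name)
            self.datasets.pop(name, None)
            del self.types[name]

    def remove_collection(self, name):
        """Remove a non-RUN collection; never deletes datasets."""
        if name not in self.types:
            raise KeyError(name)
        if self.types[name] == "RUN":
            raise ValueError("use remove_runs for RUN collections")
        self._replace_in_chains(name)
        self.chains.pop(name, None)
        del self.types[name]


reg = ToyRegistry(
    types={"u/user/out": "CHAINED", "u/user/out/run1": "RUN", "inputs/raw": "RUN"},
    chains={"u/user/out": ["u/user/out/run1", "inputs/raw"]},
    datasets={"u/user/out/run1": {1, 2}, "inputs/raw": {3}},
)
reg.remove_runs(["u/user/out/run1"])
print(reg.chains["u/user/out"])  # the removed child is now the sentinel
```

In this sketch the owner of `u/user/out` can see that a child was removed because the sentinel stays in its place in the chain, and the input collection and its datasets are left alone.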
On the command-line side of things,
- butler prune-collection and butler prune-datasets would go away;
- butler remove-runs and butler remove-collections would be added (the former would delete RUN collections and always delete datasets; the latter would delete non-RUN collections and never delete datasets);
- we add a pipetask purge command, which deletes all output RUN collections and the output CHAINED collection matching the usual pattern;
- we add a pipetask cleanup command, which deletes all output RUN collections that are not referenced by the named CHAINED collection but do match its name pattern (i.e. those left behind by --replace-run without --prune-replaced).
The last two options belong on pipetask, not butler, because it’s pipetask that defines the naming convention they rely upon to know what to delete.  They would not necessarily be able to work on processing runs where --output-run was used to customize the RUN names.
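As a rough illustration of how the two commands could pick their targets, the sketch below assumes that output RUN names have the form `<CHAINED name>/<timestamp>` with a timestamp like `20210301T120000Z`. That pattern, the function names, and the collection names are assumptions made for the example, not a statement of what pipetask implements.

```python
import re


def _run_pattern(chain_name):
    # Assumed convention: "<chain>/<YYYYMMDD>T<HHMMSS>Z"; the real pattern may differ.
    return re.compile(re.escape(chain_name) + r"/\d{8}T\d{6}Z")


def cleanup_targets(chain_name, chain_children, run_names):
    """RUN collections matching the chain's name pattern that the chain no longer references."""
    pattern = _run_pattern(chain_name)
    referenced = set(chain_children)
    return sorted(
        name for name in run_names
        if pattern.fullmatch(name) and name not in referenced
    )


def purge_targets(chain_name, run_names):
    """Every RUN collection matching the chain's name pattern, plus the chain itself."""
    pattern = _run_pattern(chain_name)
    return sorted(name for name in run_names if pattern.fullmatch(name)) + [chain_name]


runs = [
    "u/user/out/20210301T120000Z",  # replaced by a later run, no longer in the chain
    "u/user/out/20210302T090000Z",  # current output run
    "u/user/custom-run",            # named via --output-run, so it does not match
    "inputs/raw",                   # an input, must never be selected
]
children = ["u/user/out/20210302T090000Z", "inputs/raw"]

to_clean = cleanup_targets("u/user/out", children, runs)
to_purge = purge_targets("u/user/out", runs)
print(to_clean)
print(to_purge)
```

Neither function selects `u/user/custom-run`, which is the limitation noted above for runs named with --output-run, and neither selects the input collection even though the chain references it.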
Glossary:
unstore: delete files and Datastore records without (necessarily) deleting Registry records.
forget: delete Datastore records without deleting files or (necessarily) deleting Registry records.
RUN: a kind of butler collection that datasets intrinsically belong to
TAGGED: a kind of butler collection that only references datasets
CHAINED: a kind of butler collection that references other collections
disassociate: remove a reference to a dataset from a TAGGED collection