Our organization is subject to data retention requirements over very long periods (50 years or more). Do your organizations face similar constraints? If so, what long-term retention strategy have you implemented for these regulated data, and what technologies or solutions do you use to ensure their durability and compliance?
Consider storing all data as plain text if possible and using immutable data stores. Fluree is one example of such a data store; Datomic is another. These are niche technologies, so few IT teams will have heard of them and even fewer will have experience with them, but those that do seem to enjoy working with them. Interestingly, both of those products have open source cores and are written in Clojure, a Lisp that runs on the Java Virtual Machine.
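Plain text has a good chance of being readable by something even a hundred years from now. For the encoding, UTF-8 is usually the better compromise: it covers exactly the same Unicode characters as UTF-16, is more compact for mostly-ASCII content, and has no byte-order ambiguity. UTF-16 is only smaller for text dominated by scripts such as Chinese, Japanese or Korean.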
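If you want to check the storage trade-off for your own data, here is a minimal Python sketch that compares encoded sizes. The function name `encoded_sizes` and the sample strings are made up for illustration; only the standard `str.encode` call is used.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes needed to store `text` in each encoding."""
    # The "-le" variants are used so that no byte-order mark is counted.
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(text.encode(name)) for name in encodings}


# Hypothetical sample records: one ASCII-only, one in Japanese.
ascii_sample = "Valve inspection report, unit 3"
cjk_sample = "日本語"

ascii_sizes = encoded_sizes(ascii_sample)
cjk_sizes = encoded_sizes(cjk_sample)

print(ascii_sizes)
print(cjk_sizes)
```

For ASCII-only text UTF-8 takes one byte per character and UTF-16 takes two. For the Japanese sample the picture reverses (three bytes per character in UTF-8, two in UTF-16), so it is worth measuring on a representative sample of your own records.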
I have no direct experience, but I know that regulatory obligations on the nuclear industry require design, build and operational data and documentation to be held securely for the full lifecycle of a nuclear installation, including decommissioning and thereafter, so around 75 years or more. There are also requirements to periodically inspect both the data and its format (i.e. can you still read PDFs, etc.) and to confirm that the storage media are still accessible, so that data and documents can be migrated onto current media if the original medium becomes obsolete. There are specialist data management solutions used by the nuclear industry to handle the above, and they may cover life sciences or other industry needs too.
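The periodic inspection described above is often done with checksums (sometimes called fixity checks). Here is a rough Python sketch of that idea, not any particular product's method: record a SHA-256 digest per file once, then re-verify on a schedule and after every media migration. The function names and the in-memory manifest format are assumptions for illustration; only the standard library is used.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every file still matches."""
    problems = []
    for relative_name, expected in manifest.items():
        target = root / relative_name
        if not target.is_file():
            problems.append(f"missing: {relative_name}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {relative_name}")
    return problems
```

In practice the manifest would be stored separately from the archive itself (and ideally in more than one place), since a checksum kept on the same failing medium proves little. Note this only detects bit-level damage or loss; whether the format is still readable (the PDF question) needs a separate check with current software.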