Flat file databases have their place and are quite workable for the right domain. Mail servers and NNTP servers of the past really pushed the limits of how far you can take these things (which is actually quite far: file systems can hold millions of files and directories).

The two biggest weaknesses of flat file DBs are indexing and atomic updates, but if the domain is suitable these may not be an issue.

You can, for example, with proper locking, do an "atomic" index update using basic file system commands, at least on Unix.

A simple case is having the indexing process run through the data and write the new index file under a temporary name. Then, when it is done, you simply rename the new file over the old one (with either the rename(2) system call or the shell mv command). rename(2) is atomic on a Unix system, and so is mv as long as source and destination are on the same filesystem (i.e. it either works or it doesn't, and there's never a missing "in between" state).

The same approach works for new entries. Basically, write the file fully to a temp file, then rename or mv it into its final place. Then you never have an "intermediate" file in the "DB". Otherwise, you might have a race condition: a process reading a file that is still being written may reach the end before the writing process is complete.

If your primary indexing works well with directory names, then that works just fine. You can use a hashing scheme, for example, to create directories and subdirectories in which to place new files.

Finding a file by file name and directory structure is very fast, as most filesystems today index their directories.
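To make the write-to-temp-then-rename idea and the hashed directory layout concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names (entry_path, atomic_write, put, get), the use of SHA-1, and the two-level directory fan-out are all hypothetical choices, not something prescribed above. It relies on os.replace, which maps to rename(2) on Unix, and it creates the temp file in the destination directory so that both paths are on the same filesystem.

```python
import hashlib
import os
import tempfile


def entry_path(root, key):
    """Map a key to root/ab/cd/<digest>, using leading hex digits of its hash.

    The hash function and the two-level fan-out are illustrative assumptions.
    """
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return os.path.join(root, digest[:2], digest[2:4], digest)


def atomic_write(path, data):
    """Write data to a temp file next to path, then rename it into place."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # The temp file lives in the target directory so the rename stays on
    # one filesystem; a cross-filesystem move would not be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        # Readers see either the old file or the complete new one.
        os.replace(tmp, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


def put(root, key, data):
    """Store data under key and return the path it was written to."""
    path = entry_path(root, key)
    atomic_write(path, data)
    return path


def get(root, key):
    """Read back the bytes stored under key."""
    with open(entry_path(root, key), "rb") as f:
        return f.read()
```

An index file can be rebuilt the same way: call atomic_write with the index path and the freshly generated contents, and readers keep seeing the old index until the rename happens. Anything scanning the directories should skip names starting with ".tmp-", since those are in-progress writes.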