Saving Documents in PG vs File system
Doug Easterbrook
doug at artsman.com
Wed Nov 14 14:25:51 EST 2018
hi Mats.
we store stuff in the database, using postgres. Some argue that's a good idea, some argue it isn't. I'll leave that argument out of this.
how to manage large BACKUPS is the question I'm hearing, if one chooses to back up PDFs in the database for simplicity.
if you use pg_dump for your backup, run pg_dump --help
one of the flags is -T, which lets you 'exclude' certain tables from the dump.
so if you want a bunch of daily backups without your large document table, you can do something like:
pg_dump -F c -T documentTable database > destFile
and then once a day or once a week go after the entire database.
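put together, a daily/weekly rotation could be sketched roughly like this (the database name "mydb", the backup directory, and the table name are placeholders, not from the original post):

```shell
#!/bin/sh
# Hypothetical daily/weekly backup rotation sketch.
BACKUP_DIR=/var/backups/pg
STAMP=$(date +%Y%m%d)

# Daily: dump everything except the large document table (-T excludes it)
pg_dump -F c -T documentTable mydb > "$BACKUP_DIR/mydb-nodocs-$STAMP.dump"

# Weekly (Sundays, when date +%u is 7): dump the whole database, documents included
if [ "$(date +%u)" = "7" ]; then
    pg_dump -F c mydb > "$BACKUP_DIR/mydb-full-$STAMP.dump"
fi
```

the -F c (custom) format is worth keeping here, since it lets pg_restore pull back individual tables later instead of replaying one giant SQL file.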
or, you can specifically back up just the documents table using -t
pg_dump -F c -t documentTable database > destFile
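getting either dump back in is pg_restore's job, since -F c produces the custom archive format; "database" and "destFile" below are the same placeholder names as above:

```shell
# Restore a custom-format dump into an existing database
pg_restore -d database destFile

# Or pull back only the documents table from a full dump
pg_restore -d database -t documentTable destFile
```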
if you have replication set up (we like to do that), and have a live copy of the database, then it's a matter of risk management, where you can
1) assume that replication is running and you have two complete copies of the database at all times
2) dump all but the large file on a daily basis for off-machine safety
3) dump the large file weekly and take a risk that you won’t lose both replication and the main database in one day (or week).
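point 1 does assume replication is actually healthy, so it's worth checking; one quick way (assuming psql access on the primary, with "mydb" as a placeholder name) is to query pg_stat_replication:

```shell
# One row per connected standby; an empty result means
# nothing is streaming and the risk calculation above changes.
psql -d mydb -c "SELECT client_addr, state, replay_lsn FROM pg_stat_replication;"
```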
even if you store the large files outside of the database, you are still faced with backing them up. you can do incrementals using a file system backup or a time machine-like tool, but sooner or later you'll hit a spot where getting them back for a test database or what have you necessitates a full dump or a weird restore process.
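for completeness, the file-system route usually ends up looking something like an rsync job (the paths and host below are purely illustrative):

```shell
# Incremental copy of an external documents directory to a backup host;
# -a preserves metadata, --delete mirrors removals.
rsync -a --delete /srv/documents/ backuphost:/backups/documents/
```

which works, but note it leaves you with two restore procedures (database plus files) instead of one.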
so I like to be clean and use one common approach
- dump everything once in a while (if that's needed)
- dump the small stuff all the time
- use replication
and it's all postgres and easily importable/usable by the database & application.
just my thoughts
Doug Easterbrook
Arts Management Systems Ltd.
mailto:doug at artsman.com
http://www.artsman.com
Phone (403) 650-1978
> On Nov 14, 2018, at 10:04 AM, ADJob <mats at adjob.se> wrote:
>
> Hi,
>
>> A question for those of you who store documents in your Omnis system. We currently do, and we have a separate DB for the storage, and keep the metadata in the main database.
>> However, in certain cases, the PG database size is now over 100gb when exported. And 99% of that data is static, so a waste of time in backing it up etc every day.
>>
>> 1 I could split the database into 2 parts as well, 1 for new items, which will periodically move items to the 2nd archive database.
>> 2 But I want to explore storing the files in the file system, and passing them to the user when asked for. Can PostGres save files locally, and then read them and send them back to a user within a SQL call?
>
> AFAIK Postgresql cannot interact with the operating system. I think you will have to use Python, Go or any other app to do this stuff. (I have been told)
>
>> 3 I also remember Marten Verhoven showing how he uses Symbolic links to files that would normally be hidden AND inaccessible unless given.
>>
>> So what do others do at this point with numbers of docs exceeding 100Gb now?
>
> Store files of max 100 K each within the database as large_objects. Bigger files should IMHO be stored in the file system with some sort of reference in the database.
>
> FWIW,
>
> /Mats
>
> _____________________________________________________________
> Manage your list subscriptions at http://lists.omnis-dev.com
> Start a new message -> mailto:omnisdev-en at lists.omnis-dev.com
More information about the omnisdev-en mailing list