This is part 2 of my adventures in making version control systems go boom.
As I described before, I need to version some reasonably large files. After trying Mercurial and Git, I decided to go with Git as it presented me with fewer problems.
To make matters worse than before, I now need to version 3 files which are about 2.7GB in size each. I tried to git-add the directory, but I got this wonderful message:
$ git-add dir/
The following paths are ignored by one of your .gitignore files:
dir/ (directory)
Use -f if you really want to add them.
$ git-add -f dir/
fatal: dir/: can only add regular files or symbolic links
Wha?
- I don’t have any .gitignore files in this repository
- Adding a directory like that worked (and still works!) on other directories
Really painful. Time to experiment, but first I ran git-status to see what other files I had not committed yet, and I saw everything listed except the directory! So I moved one of the files to the top directory of the repo and ran git-status. The file did not show up, but I tried to add it anyway:
$ git-add file
fatal: pathspec 'file' did not match any files
Ok, this time around, I at least get an error message which I’ve seen before. It is still wrong, but oh well. Thankfully, the program that uses these files has been made in such a way that it can handle filesystems which don’t support files larger than 2GB. I regenerated the file, and now I have 2 files, the first one 2GB and the other 667MB. git-status displays both. Great! git-add on the smaller file works flawlessly, but…you guessed it! Adding the larger file dies. Which error message?
fatal: Out of memory, malloc failed
Yep, great. My laptop’s 1GB of RAM just isn’t good enough, eh? I’m not quite sure what I’ll do. I’ll probably scp everything over to a box with 2+GB of RAM and commit things there. This really sucks :-/
Update: I asked around on IRC (#git), where I got a few pointers, and the code confirms things: it would seem that git-hash-object tries to mmap the entire file. This explains the out of memory error. The other problem is that the file size is stored in an unsigned long variable, which is 32 bits on my laptop. Oh well, so much for files over 4GB. I think (but I’m not sure, and I’m too lazy to check) that the stat structure may return a signed int, which would limit things to 2GB, which is what I see.
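To make the two limits above a bit more concrete, here is a rough Python sketch. It is not Git's code. The helper names (max_size, git_blob_sha1) are made up for illustration. The first helper just works out the largest value a 32-bit signed or unsigned integer can hold. The second shows the general alternative to mapping a whole file at once: feeding the hash one chunk at a time, so memory use stays around the chunk size regardless of file size. The only Git-specific fact it relies on is that a blob's name is the SHA-1 of the header "blob <size>" followed by a NUL byte and then the file contents.

```python
import hashlib
import io


def max_size(bits, signed):
    """Largest value an integer of the given width can hold."""
    return 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1


def git_blob_sha1(stream, size, chunk_size=1 << 20):
    """Compute a Git-style blob id while reading one chunk at a time.

    `stream` is any binary file-like object and `size` is its total
    length in bytes. Only `chunk_size` bytes are held in memory at once.
    """
    h = hashlib.sha1()
    h.update(b"blob %d\0" % size)  # blob header: "blob <size>\0"
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)
    return h.hexdigest()


if __name__ == "__main__":
    file_size = 2_700_000_000  # roughly one of the 2.7GB files
    print("fits in signed 32-bit:  ", file_size <= max_size(32, signed=True))
    print("fits in unsigned 32-bit:", file_size <= max_size(32, signed=False))
    print("blob id of b'hello\\n':  ", git_blob_sha1(io.BytesIO(b"hello\n"), 6))
```

For a real file you would open it in binary mode and pass os.path.getsize(path) as the size. A 2.7GB file is past the signed 32-bit limit but under the unsigned one, which is consistent with the guess above that a signed size is what bites first.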