OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours





Poll: Should I allow copy-on-write directories?
Yes: 75% [ 18 ]
No: 25% [ 6 ]
Total votes: 24
 Post subject: Design problem/question on FS design
PostPosted: Fri Jan 05, 2007 1:29 pm 
Joined: Tue Oct 17, 2006 11:33 pm
Posts: 3882
Location: Eindhoven
I'm working on the *FS mkfs tool. It's far enough along that I've reached the part where you make minute decisions that have a huge impact later on.

I'm implementing support for copy-on-write files, where a file inode is shared by multiple file links, and a link can only write to it by copying it before the actual write. This works fine. What I'm considering now: if you allow this for directory inodes as well, writing a file includes checking all of its parent directories and possibly copying a number of them. Not allowing it would weaken the feature, since you'd still get loads of copies of the directory tree.

What do you think?
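
To make the cost of directory COW concrete, here is a minimal in-memory sketch. It is not the *FS implementation: `Node`, `cow_copy`, `snapshot` and `write_file` are hypothetical names, a link count stands in for whatever sharing information the real inodes carry, and the root directory is assumed never to be shared.

```python
class Node:
    """Hypothetical inode: a file (data) or a directory (children)."""
    def __init__(self, data=None, children=None):
        self.data = data          # bytes for files, None for directories
        self.children = children  # dict name -> Node for directories
        self.links = 1            # how many directory entries point here


def cow_copy(node):
    """Shallow copy. A copied directory shares all of its children."""
    if node.children is None:
        return Node(data=node.data)
    for child in node.children.values():
        child.links += 1          # now referenced by the copy as well
    return Node(children=dict(node.children))


def snapshot(parent, src, dst):
    """COW-link a whole subtree under a second name; nothing is copied."""
    node = parent.children[src]
    node.links += 1
    parent.children[dst] = node


def write_file(root, path, data):
    """Write to a file, copying every shared node on the way down.

    Returns the number of inodes that had to be copied.
    """
    copied = 0
    parent = root
    for name in path:
        node = parent.children[name]
        if node.links > 1:        # shared: detach a private copy first
            node.links -= 1
            node = cow_copy(node)
            parent.children[name] = node
            copied += 1
        parent = node
    parent.data = data
    return copied


src = Node(children={"a.c": Node(data=b"old"), "b.c": Node(data=b"same")})
root = Node(children={"tree": Node(children={"src": src})})
snapshot(root, "tree", "backup")
first = write_file(root, ["tree", "src", "a.c"], b"new")
second = write_file(root, ["tree", "src", "a.c"], b"newer")
```

In this sketch the first write after a snapshot copies one inode per path component (two directories and the file), a second write to the same file copies nothing, and untouched siblings such as `b.c` stay shared.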


PostPosted: Fri Jan 05, 2007 1:54 pm 
Joined: Tue Oct 17, 2006 9:29 pm
Posts: 2426
Location: Canada
Sounds cool Candy.. :)

*FS? I'm guessing by the asterisk you're designing your own file system and haven't picked a name?

How about CFS? Candy File System.. hehehehe :P

An example of COW here is neat.. Surprised I've never seen it before :?

To me it just seems like a bit of a space waster.. As it looks like it's just making duplicates of the file.. How's it different than making a copy of the file normally using cp? (Or is the duplicate temporary.. in memory?)

_________________
Twitter: @canadianbryan. Award by smcerm, I stole it. Original was larger.


PostPosted: Fri Jan 05, 2007 2:07 pm 
Joined: Tue Oct 17, 2006 11:33 pm
Posts: 3882
Location: Eindhoven
Brynet-Inc wrote:
*FS? I'm guessing by the asterisk you're designing your own file system and haven't picked a name?

The name is StarFS, short name is *FS.

Quote:
An example of COW here is neat.. Surprised I've never seen it before :?

To me it just seems like a bit of a space waster.. As it looks like it's just making duplicates of the file.. How's it different than making a copy of the file normally using cp? (Or is the duplicate temporary.. in memory?)


Suppose you have a source tree. You make a daily backup by copying it to a subdirectory with the date in its name. Every copy takes space for the entire tree of files. The idea is: since the files are the same, and you're most likely not going to write to any of them (the most likely reason for copies is backups or something similar), why copy at all?

So, instead, you link to the inode like a unix hardlink. Except that when you write, the file is copied first and the write goes to the copy, so it appears to be just another file.
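
A rough sketch of that difference from a plain hardlink, with the volume modelled as a Python dict. `Inode`, `Volume`, `cow_link` and the link counter are illustrative assumptions, not the *FS on-disk structures.

```python
class Inode:
    """Hypothetical inode: file contents plus the number of names using it."""
    def __init__(self, data=b""):
        self.data = data
        self.links = 1


class Volume:
    def __init__(self):
        self.names = {}                 # path -> Inode

    def create(self, path, data):
        self.names[path] = Inode(data)

    def cow_link(self, src, dst):
        """Share the inode under a second name, like a hardlink. No copy."""
        inode = self.names[src]
        inode.links += 1
        self.names[dst] = inode

    def write(self, path, data):
        inode = self.names[path]
        if inode.links > 1:
            # Unlike a hardlink, a shared inode is never modified in place:
            # this name gets a private copy and the write goes there.
            inode.links -= 1
            inode = Inode(inode.data)
            self.names[path] = inode
        inode.data = data

    def read(self, path):
        return self.names[path].data


vol = Volume()
vol.create("src/main.c", b"v1")
vol.cow_link("src/main.c", "backup-2007-01-05/main.c")
shared_before = vol.names["src/main.c"] is vol.names["backup-2007-01-05/main.c"]
vol.write("src/main.c", b"v2")
```

Until the write, both names resolve to one inode and the backup costs only a directory entry. After it, the backup still reads `v1` and only the written name sees `v2`.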

It allows for a number of things that previously were plain stupid:

- You could make your file system function like GMail, in that you don't put a file in one directory but you put it in each directory that applies. As long as you don't change it, it'll be the same in each directory.

- You can add intelligent diff management to the filesystem so instead of actually COWing the file it creates a new inode with a backlink and a diff method, and only stores the diff. That way you never delete anything that you've backed up and you won't waste space duplicating half the file either. Not too sure on this though, but it's a point we were thinking on.
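
One way the diff idea could look, as a sketch only: the new inode keeps a backlink to the original and a list of byte-range patches, and a read replays the patches over the base. `BaseInode`, `DiffInode` and the (offset, bytes) patch format are assumptions made for illustration; the post does not specify a diff method.

```python
class BaseInode:
    """Hypothetical ordinary inode holding the full file contents."""
    def __init__(self, data):
        self.data = bytes(data)

    def read(self):
        return self.data


class DiffInode:
    """Hypothetical inode that stores only changes relative to a base."""
    def __init__(self, base):
        self.base = base       # backlink to the inode this one derives from
        self.patches = []      # list of (offset, bytes), in write order

    def write(self, offset, data):
        self.patches.append((offset, bytes(data)))

    def read(self):
        buf = bytearray(self.base.read())
        for offset, data in self.patches:
            end = offset + len(data)
            if end > len(buf):                 # write past the end: grow
                buf.extend(b"\0" * (end - len(buf)))
            buf[offset:end] = data
        return bytes(buf)

    def stored_bytes(self):
        """Payload bytes actually stored for this version."""
        return sum(len(data) for _, data in self.patches)


base = BaseInode(b"hello world")
version = DiffInode(base)
version.write(6, b"there")
```

The base stays untouched, so a backed-up version is never lost, and the new version costs 5 bytes of payload instead of a full copy. The price is that every read has to replay the patch list, which is probably the speed concern.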


PostPosted: Fri Jan 05, 2007 4:32 pm 
Joined: Tue Nov 21, 2006 3:17 pm
Posts: 29
Yes, but I would include the diff system with the filesystem. I don't see a reason why you wouldn't want it...except maybe speed...


PostPosted: Fri Jan 05, 2007 4:59 pm 
Joined: Tue Oct 17, 2006 11:33 pm
Posts: 3882
Location: Eindhoven
I've got a preliminary version of mkfs in my Subversion repository, which follows an XML file to create a filesystem with a preloaded set of information. An example script is also included (it makes a disk image of my OS, of course), along with all the code and scripts needed to do so (but actually executing that takes about an hour on my machine, mostly due to compiling two custom cross-compilers).

I'll try to upload an image if anybody cares to take a look at the binary structures created as such. It merges identical files, but doesn't try to do that with directories (yet). It's also pretty limited in a few other aspects, but I think this'll work for a basic version.

I might rip out the hashes from the inodes, they're large. It still needs info for allowing diffs instead of just files. I'm also not quite sure which disk spanning methods to use, so the inodes might grow a bit for that and lose the hash, and so pretty much stay the same size. The file / directory / section structs aren't final or even close to it. There's no way to find out where the inode file is right now, you'll have to wait until I add that info to the boot section somehow. The system information is in the first four inodes, in the order [ inode, boot, free, section ]. The inode file contains the inodes, the boot file contains the boot block (32k at the start), the free file contains all the free extents (and is atm the only indirect file) and the section file contains all info on the sections.
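
For anyone wondering how merging identical files at mkfs time might work, a minimal sketch follows. It assumes the merge is keyed on a content hash (SHA-256 here, chosen only for illustration) with a byte comparison as a safety net; `build_inode_table` and its data layout are hypothetical and not taken from the actual tool.

```python
import hashlib


def build_inode_table(files):
    """Assign inode numbers to files, merging files with identical contents.

    files: dict mapping path -> bytes.
    Returns (inodes, directory): a list of contents indexed by inode number
    and a dict mapping each path to its inode number.
    """
    inodes = []
    by_hash = {}      # content digest -> inode number
    directory = {}
    for path, data in sorted(files.items()):
        digest = hashlib.sha256(data).digest()
        ino = by_hash.get(digest)
        if ino is None:
            ino = len(inodes)
            inodes.append(data)
            by_hash[digest] = ino
        elif inodes[ino] != data:
            # Same digest, different bytes: store separately, never merge.
            ino = len(inodes)
            inodes.append(data)
        directory[path] = ino
    return inodes, directory


files = {
    "a/COPYING": b"license text",
    "b/COPYING": b"license text",
    "a/main.c": b"int main(void) { return 0; }",
}
inodes, directory = build_inode_table(files)
```

Three paths end up on two inodes here. Keeping the hash in the inode makes the lookup cheap at mkfs time, but it costs space in every inode, which is the trade-off mentioned above.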


PostPosted: Fri Jan 05, 2007 5:15 pm 
Joined: Wed Oct 18, 2006 3:45 am
Posts: 9301
Location: On the balcony, where I can actually keep 1½m distance
Sounds like: "Why would we put a jet engine into a motorcycle?"
Answer: "Because we can" 8)

Apart from that, I like the idea. And since you are COWing files, doing it for directories is just the logical next step.

_________________
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]


 Post subject: Re: Design problem/question on FS design
PostPosted: Fri Jan 05, 2007 11:01 pm 
Joined: Sat Nov 25, 2006 6:33 am
Posts: 67
Location: PRC
Candy wrote:
I'm working on the *FS mkfs tool. It's far enough along that I've reached the part where you make minute decisions that have a huge impact later on.

I'm implementing support for copy-on-write files, where a file inode is shared by multiple file links, and a link can only write to it by copying it before the actual write. This works fine. What I'm considering now: if you allow this for directory inodes as well, writing a file includes checking all of its parent directories and possibly copying a number of them. Not allowing it would weaken the feature, since you'd still get loads of copies of the directory tree.

What do you think?


So at sharing time, only links (references) are added to point at the shared files, and the files are copied only when a write actually occurs?

The key point is that the links (pointers or references) are separated from the contents (data), may I understand it like that?

Will a shared file be copied completely when it's about to be written through one or more links (sometimes that's unnecessary, or may slow down the process)? Or are you going to generate additional minimal descriptions that specify just what is modified (which copy, and the location within it), so that when the file is read next time, the FS combines the original with the modification description for the requested copy and returns the final output?

Generally, I think that will work fine if the backups are meant to capture updates made over different periods.

Anyway, I think that's good. :)

Combuster wrote:
Sounds like: "Why would we put a jet engine into a motorcycle?"
Answer: "Because we can"

Apart from that, I like the idea. And since you are COWing files, doing it for directories is just the logical next step.


Yeah...
Copy-on-write may help more in a distributed system, but similar techniques can also be valuable on the desktop.


PostPosted: Mon Jan 08, 2007 2:07 pm 
Joined: Tue Oct 17, 2006 11:33 pm
Posts: 3882
Location: Eindhoven
I've been tweaking a few bugs out of the mkfs tool and it now properly handles huge images and lots of files. Current example to show that COW does actually save space (this is without directory COW but with file COW):

Quote:
candy@blackbox:~$ ls -l /data/disk.img
-rw-r--r-- 1 candy users 16492674416640 2007-01-08 21:10 /data/disk.img
candy@blackbox:~$ du -sh /data/disk.img
4.3G /data/disk.img
candy@blackbox:~$ du -sh .
12G .
candy@blackbox:~$


The example target output is 15TB large (terabyte), since it's a sparse file it fits. It takes 4.3GB of space in total, including management information for the files. The biggest directory that's in it itself takes 12GB on the host file system. That's about 60-65% compression without any compression at all. The downside is that writing a big shared file is slower, but I'm gonna work on a diff thing that allows that without duplicating the files.
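
The numbers in that listing can be checked with a few lines of arithmetic. The sizes are the ones printed above; `du -sh` rounds its output, so the saving is approximate. `allocated_bytes` is an illustrative helper for Unix hosts that shows where the sparse-file difference comes from; it is defined here but not called.

```python
import os

TIB = 2 ** 40

apparent_bytes = 16492674416640   # size reported by ls -l for disk.img
image_gib = 4.3                   # du -sh /data/disk.img (blocks really used)
source_gib = 12.0                 # du -sh . (the biggest directory in the image)

apparent_tib = apparent_bytes / TIB        # apparent size in TiB
saving = 1 - image_gib / source_gib        # fraction of space saved


def allocated_bytes(path):
    """Bytes actually allocated to a file on a Unix host.

    st_blocks counts 512-byte units, so a sparse file reports far less
    here than its st_size.
    """
    return os.stat(path).st_blocks * 512
```

The image is exactly 15 TiB in apparent size, and 4.3 GB against 12 GB works out to a saving of roughly 64%, which is inside the 60-65% range quoted.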


 Post subject: Re: Design problem/question on FS design
PostPosted: Mon Jan 08, 2007 5:03 pm 
Joined: Sun Oct 24, 2004 11:00 pm
Posts: 134
Location: North Dakota, where the buffalo roam
Candy wrote:
I'm working on the *FS mkfs tool. It's far enough along that I've reached the part where you make minute decisions that have a huge impact later on.

I'm implementing support for copy-on-write files, where a file inode is shared by multiple file links, and a link can only write to it by copying it before the actual write. This works fine. What I'm considering now: if you allow this for directory inodes as well, writing a file includes checking all of its parent directories and possibly copying a number of them. Not allowing it would weaken the feature, since you'd still get loads of copies of the directory tree.

What do you think?


This is the coolest idea I've heard here in a long time. Seems like it could lead to pretty severe fragmentation though. Have you put any thought into avoiding this?


PostPosted: Tue Jan 09, 2007 2:53 am 
Joined: Mon Jan 08, 2007 3:19 am
Posts: 30
Location: UK
My immediate thought for an application was setting up filesystems for chroot (or similar) virtual machines -- you'd no longer need a copy of the OS and standard applications for each VM, it could all be done with COW links.

Nice one. :)


 Post subject: Re: Design problem/question on FS design
PostPosted: Thu Jan 11, 2007 11:50 am 
Joined: Tue Oct 17, 2006 11:33 pm
Posts: 3882
Location: Eindhoven
rexlunae wrote:
This is the coolest idea I've heard here in a long time. Seems like it could lead to pretty severe fragmentation though. Have you put any thought into avoiding this?


I'm not sure how you figure it'd cause fragmentation.

I do intend to avoid fragmentation, using a few tricks:

- When you write to a small file or directory, instead of always overwriting and reusing the existing sectors, allocate a new extent using best-fit allocation (which can/should include the existing current block) and write it there instead. That should lead to less fragmentation due to small increments in filesize.
- Attempt to keep extents at least a number of clusters in length. The longer the extents are the less fragmented the files will be.
- When handling a large file, attempt to keep it in extents of a logical length, which depends on the filetype itself. For example, for MP3 files, keep 1M segments, for MPEG2 video keep a segment of about a group of pictures etc. That keeps the performance more deterministic in most cases leading to more reliable behaviour. I'm not quite sure on how to do this but the FS itself has support on the inode level for storing the file type of the item, so the info is/can be available...
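
The first point in the list above can be sketched as a best-fit allocator over a free-extent list. Everything here is an assumption for illustration: extents are (start, length) tuples counted in clusters, `release` returns the file's current extent to the pool so that it can be part of the candidates, and adjacent free extents are merged.

```python
def best_fit(free_extents, length):
    """Take `length` clusters from the smallest free extent that fits.

    free_extents is a list of (start, length) tuples and is updated in
    place. Returns the start of the allocated run, or None if nothing fits.
    """
    best = None
    for i, (start, size) in enumerate(free_extents):
        if size >= length and (best is None or size < free_extents[best][1]):
            best = i
    if best is None:
        return None
    start, size = free_extents[best]
    if size == length:
        del free_extents[best]
    else:
        free_extents[best] = (start + length, size - length)
    return start


def release(free_extents, extent):
    """Return an extent to the free list, merging adjacent free extents."""
    free_extents.append(extent)
    free_extents.sort()
    merged = []
    for start, size in free_extents:
        if merged and merged[-1][0] + merged[-1][1] == start:
            merged[-1] = (merged[-1][0], merged[-1][1] + size)
        else:
            merged.append((start, size))
    free_extents[:] = merged


free = [(0, 4), (10, 16), (40, 6)]
first = best_fit(free, 5)       # smallest hole that fits is (40, 6)
release(free, (4, 6))           # a file at clusters 4..9 is being rewritten
after_release = list(free)
second = best_fit(free, 8)      # its old clusters are candidates again
```

Best-fit leaves the large holes intact for large files. Note that in a real COW write the old extent could only be released after the new inode is on disk; the sketch releases it first purely to show the merging.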

I'm still considering whether I'll let very small / active files be in the log only. I think I might, but not quite sure. That would lead to less fragmentation as the log is by its nature rewritten often.

Do you have any ideas to avoid fragmentation?


PostPosted: Thu Jan 11, 2007 1:56 pm 
Joined: Wed Oct 18, 2006 3:45 am
Posts: 9301
Location: On the balcony, where I can actually keep 1½m distance
For the average case, using this to prevent fragmentation should be pretty effective without being bad for stability. If you want to take it to the extreme, you should read up on the likes of XFS...



PostPosted: Thu Jan 11, 2007 3:46 pm 
Joined: Tue Oct 17, 2006 11:33 pm
Posts: 3882
Location: Eindhoven
Combuster wrote:
For the average case, using this to prevent fragmentation should be pretty effective without being bad for stability. If you want to take it to the extreme, you should read up on the likes of XFS...


Actually, it's better for stability. You only need to have an atomic update for the inode, which is fairly trivial to do. This way you can un-journal the other files without a problem. You just write them to a number of sectors that were/are unused, overwrite those without any care, and when that's done you overwrite the inode. Sector writes are atomic and inodes are <= one sector (atm 1/8th of a sector), so you will never get a half-finished operation. No journal is even needed for this.
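
As a toy model of that ordering: the disk is a list of 512-byte sectors, a single sector write is treated as atomic (the assumption stated above), and the inode is simplified to one whole sector holding three integers. `Disk`, `cow_update`, `read_file` and the `crash_before_commit` flag are hypothetical names for illustration.

```python
import struct

SECTOR = 512


class Disk:
    """Toy disk where each write_sector call is assumed to be atomic."""
    def __init__(self, sectors):
        self.sectors = [bytes(SECTOR)] * sectors

    def write_sector(self, n, data):
        assert len(data) <= SECTOR
        self.sectors[n] = data.ljust(SECTOR, b"\0")


def read_file(disk, inode_sector):
    """Follow the inode to its extent and return the file contents."""
    start, count, size = struct.unpack_from("<III", disk.sectors[inode_sector])
    return b"".join(disk.sectors[start:start + count])[:size]


def cow_update(disk, inode_sector, free_start, data, crash_before_commit=False):
    """Write data to unused sectors, then commit by overwriting the inode."""
    count = -(-len(data) // SECTOR)            # ceiling division
    for i in range(count):
        disk.write_sector(free_start + i, data[i * SECTOR:(i + 1) * SECTOR])
    if crash_before_commit:
        return False                           # power lost: inode untouched
    # The single inode write is the commit point.
    disk.write_sector(inode_sector, struct.pack("<III", free_start, count, len(data)))
    return True


disk = Disk(16)
cow_update(disk, 0, 1, b"old contents")
cow_update(disk, 0, 5, b"new contents", crash_before_commit=True)
after_crash = read_file(disk, 0)
cow_update(disk, 0, 5, b"new contents")
after_commit = read_file(disk, 0)
```

A crash before the inode write leaves the old file fully readable, and the half-written sectors are still unused space. A reader sees either the old version or the new one, never a mix.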

I will add a journal for larger transactioned actions such as recursively removing a directory. Going to have a lot of fun testing that ;)

