
What other file systems can use space like APFS from High Sierra


G+_Daniel Stagner

[Attached screenshot: "Screen Shot 2018-01-17 at 10…"]

What other file systems can use space like APFS from High Sierra? Please see the attached photo if you are wondering what I mean.

 

The other day I started moving all my photo libraries to one hard drive in order to search them for photos of my brother. I wanted them on a local USB drive for ease and speed of access. I decided to place them on an Apple JBOD RAID that I cobbled together from two 4 TB hard drives. I went with APFS, figuring it would likely be a good choice for the file system given the extra copies of image files I was likely to generate.

 

I was not expecting that APFS could use space so effectively. As I copied library after library to the RAID, it just stopped taking up space. This was because the multiple libraries were backups of each other over the years.

 

In any case, what I want to know is: is there a file system available for FreeNAS or its like that will do the same thing? Also, is APFS using hashing to identify the duplicate files, or something else?
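On the hashing question: file systems that deduplicate typically fingerprint blocks with a cryptographic hash and store each unique block only once. Below is a minimal sketch of that general idea in Python. It is an illustration of hash-based dedup, not APFS's actual mechanism (Apple hasn't documented that in this detail, and APFS's space savings in this scenario may come from copy-on-write clones rather than hashing):

```python
import hashlib

def dedup_blocks(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep each unique block once.

    Returns (store, layout): store maps hash -> block bytes,
    layout lists the block hashes in file order.
    """
    store = {}
    layout = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)   # only the first copy is stored
        layout.append(digest)
    return store, layout

def reassemble(store, layout):
    """Rebuild the original bytes from the dedup store and layout."""
    return b"".join(store[h] for h in layout)

# A "file" that contains a duplicate copy of itself, like backup libraries:
a = b"A" * 8192 + b"unique tail"
combined = a + a
store, layout = dedup_blocks(combined)
assert reassemble(store, layout) == combined
```

With duplicate content, `len(store)` (unique blocks actually kept) ends up much smaller than `len(layout)` (blocks referenced), which is exactly the "libraries stopped taking up space" effect.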

Link to comment
Share on other sites

EMC has an enterprise device (called "Data Domain") that does de-dupe at the byte level. Exceptionally useful for database backups, where you might want to keep 30+ days' worth of backups without actually consuming 30x the database's size.

 

I'm not aware of any other consumer file systems that do this though. Definitely curious!


Ben, it looks like the EMC product is pretty sophisticated. The Stream-Informed Segment Layout (SISL) Scaling Architecture white paper did not help me understand how the deduping process works. I did get that it's done in memory instead of on disk, and that it takes up less memory than traditional approaches. I am not that smart, and not well suited by my education for the type of information it provided. I also gather that it uses checksums before, after, and during each data read/write session, like ZFS. I figured it may just be keeping a log of the checksums and not rewriting that data, but the paper points out that such a log would need to be massive and could not be stored in active memory due to its size. This is my understanding of how zip compression works, too. It also has no prices posted, and if you have to ask…
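The paper's point about the checksum log outgrowing RAM checks out with some back-of-the-envelope arithmetic. The 4 KiB block size and 64-byte index entry below are assumptions for illustration, not Data Domain's actual format (the whole point of SISL is avoiding a full in-RAM index):

```python
# Rough estimate of an in-memory dedup index for 1 TiB of unique data.
# Assumptions (not from the SISL paper): 4 KiB blocks, ~64 bytes per
# index entry (hash digest plus block pointer and metadata).
block_size = 4 * 1024            # 4 KiB per block
entry_size = 64                  # bytes of index per block (assumed)
data = 1024 ** 4                 # 1 TiB of data to index
blocks = data // block_size      # how many blocks need an entry
index_bytes = blocks * entry_size
print(f"{blocks:,} blocks -> {index_bytes / 1024**3:.0f} GiB of index")
```

So even under generous assumptions, a full checksum log for a single TiB runs to double-digit gigabytes of index, which is why keeping it entirely in active memory doesn't scale.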

 

Travis, I have seen software apps that I can use on desktops that will provide deduping via the app. But I have not seen anything that will do it on the fly in any other file system. If you know of something that will work on a FreeNAS box please let me know.


Ben Reese, it's mostly done at the file system level, and different file systems have different means of actually doing deduplication. ZFS has one of the most memory-intensive dedup routines I can think of, for example.

 

Doing a little snooping around this afternoon, it looks like XFS doesn't offer dedup yet, which makes me a little sad, as it's one of the best file systems around.


Travis, you are very much right. I looked at the FreeNAS user guide (https://doc.freenas.org/11/storage.html#deduplication) and found that when you say "memory intensive" you are not joking around. It states that as much as 5 GB of RAM per TB of deduped data is required to avoid a kernel panic. That could be disastrous.
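That 5 GB-per-TB rule of thumb adds up fast. A quick calculation for a hypothetical pool the size of the two 4 TB drives from the original post:

```python
# FreeNAS guidance: up to 5 GB of RAM per TB of deduplicated data.
# The 8 TB pool size is a hypothetical, matching two 4 TB drives.
ram_per_tb = 5            # GB of RAM per TB of deduped data (upper bound)
pool_tb = 8               # hypothetical pool size in TB
ram_needed = ram_per_tb * pool_tb
print(f"{pool_tb} TB deduped -> up to {ram_needed} GB RAM for the dedup table")
```

40 GB of RAM just for the dedup table is well beyond most home NAS builds, which is presumably why the guide warns about kernel panics when it runs out.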


Travis Hershberger, I did some research, and Windows deduplication seems to be a Server 2012+ feature that works on NTFS and ReFS. It requires adding the role under File Services, then it has to be enabled in Server Manager (or PowerShell) for each drive. Just enabling it doesn't do everything, though; it then runs a process on a schedule (the default is every 3 days, and it can be run manually).

 

I tested it on an iSCSI drive by copying the same 4GB file into 3 folders, and in a 4th folder I combined that file with another to make a GB file. After running the process manually on the drive, I checked the total folder size. And... no change. It was still using 16GB.
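One thing that can muddy a test like this: a folder's reported size is usually the logical (apparent) size, which doesn't shrink when dedup reclaims blocks underneath it. On a POSIX system you can compare apparent size against actually-allocated blocks; the sketch below uses a sparse file to show the two numbers diverging. This is the general idea rather than a Windows/NTFS recipe (`st_blocks` isn't available there), so it's illustrative only:

```python
import os
import tempfile

def sizes(path):
    """Return (apparent_size, allocated_bytes) for a file."""
    st = os.stat(path)
    # st_blocks counts 512-byte units regardless of the FS block size
    return st.st_size, st.st_blocks * 512

# Sparse-file demo: 1 MiB of apparent size with (almost) nothing allocated,
# the same mismatch you'd see between logical size and deduped usage.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(1024 * 1024)   # punch a 1 MiB hole; no data is written
    path = f.name

apparent, allocated = sizes(path)
print(f"apparent={apparent} bytes, allocated={allocated} bytes")
os.unlink(path)
```

So a "no change" in folder size doesn't necessarily mean dedup did nothing; on Windows the dedup job's own savings report (or the volume's free space) is the number to watch.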

 

Some afterthoughts... Maybe the files I tested with were too big, or too new. Maybe it doesn't work with BitLocker (though I'd expect a warning at some point if so). Maybe it works better on a ReFS drive. Maybe I'm fundamentally misunderstanding how the process works. Regardless, I'd prefer to have this feature on my NAS than on Windows Server.

 

TL;DR: the Windows file deduplication service didn't work in my test.


