G+_Daniel Stagner Posted January 17, 2018
What other file systems can use space the way APFS in High Sierra does? Please see the attached photo if you are wondering what I mean. The other day I started moving all my photo libraries to one hard drive so I could search them for photos of my brother. I wanted them on a local USB drive for ease and speed of access. I decided to place them on an Apple JBOD RAID that I cobbled together from two 4 TB hard drives. I went with APFS, expecting it to be a good choice of file system given the extra copies of image files I was likely to generate. I was not expecting APFS to use space so effectively. As I copied library after library to the RAID, it just stopped taking up additional space. This was due to the multiple libraries being backups of each other over the years. In any case, what I want to know is: is there a file system available for FreeNAS or its like that will do the same thing? Also, is APFS using hashing to identify the duplicate files, or something else?
G+_Travis Hershberger Posted January 17, 2018
I'm not sure how APFS is actually accomplishing it, but this same feature is available almost everywhere. The trade-off is that you lose speed and use a lot of RAM. Sorry, but I can't remember the details off the top of my head.
G+_Seth Leedy Posted January 17, 2018
Deduplication is the common name for this space-saving feature.
G+_Ben Reese Posted January 17, 2018
EMC has an enterprise device (called "Data Domain") that does de-dupe on the byte level. It is exceptionally useful for database backups, where you might want to keep 30+ days' worth of backups without actually consuming 30x the database's size. I'm not aware of any other consumer file systems that do this, though. Definitely curious!
G+_Daniel Stagner Posted January 18, 2018 (Author)
Ben, it looks like the EMC product is pretty sophisticated. The Stream-Informed Segment Layout (SISL) Scaling Architecture white paper did not help me understand how the deduping process works. I did get that it's done in memory instead of on disk, and that it takes up less memory than other, traditional processes. Well, I am not that smart, and my education did not prepare me for the type of information it provided. I also get that it uses checksums before, during, and after each data read/write session, like ZFS. I figured it may just be keeping a log of the checksums and not rewriting that data. But the paper points out that such a log would need to be massive and could not be kept in active memory because of its size. This is my understanding of how zip compression works. It also has no prices posted, and if you have to ask…
Travis, I have seen desktop software that will provide deduping via the app. But I have not seen anything that will do it on the fly in any other file system. If you know of something that will work on a FreeNAS box, please let me know.
G+_Travis Hershberger Posted January 18, 2018
Ben Reese, it's mostly done at the file system level. Different file systems have different means of actually doing deduplication. ZFS, for example, has one of the most memory-intensive dedup routines I can think of. Doing a little snooping around this afternoon, it looks like XFS doesn't offer dedup yet, which makes me a little sad, as it's one of the best file systems around.
G+_Ben Reese Posted January 18, 2018
Does anyone have experience with Microsoft ReFS? A quick search this morning led to that as a possibility for this type of de-duplication.
G+_Travis Hershberger Posted January 18, 2018
Ben Reese, I have a couple of drives using it at home. I'll try to see tonight whether I have the option to turn it on.
G+_Daniel Stagner Posted January 19, 2018 (Author)
Travis, you are very much right. I looked at the FreeNAS user guide here (https://doc.freenas.org/11/storage.html#deduplication) and found that when you say "memory intensive" you are not joking around. It states that as much as 5 GB of RAM per TB of deduped data is required to avoid a kernel panic. That could be disastrous.
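To put that figure in perspective, here is a back-of-the-envelope sketch in Python. It assumes only the "as much as 5 GB of RAM per TB of deduped data" rule of thumb quoted above; the function name and the 8 TB example (two 4 TB drives, fully deduplicated) are illustrative assumptions, not anything taken from the FreeNAS guide.

```python
def dedup_ram_estimate_gb(deduped_tb: float, gb_per_tb: float = 5.0) -> float:
    """Rough upper-end RAM estimate for dedup, using a per-TB rule of thumb.

    gb_per_tb defaults to the 5 GB/TB figure quoted in the thread;
    treat the result as a ballpark, not a sizing guarantee.
    """
    if deduped_tb < 0:
        raise ValueError("deduped_tb must be non-negative")
    return deduped_tb * gb_per_tb


# Hypothetical example: two 4 TB drives, all of it deduplicated data.
print(dedup_ram_estimate_gb(8))  # 40.0
```

By that estimate, a pool the size of the two-drive setup from the first post could want on the order of 40 GB of RAM for dedup alone, which is more than most home NAS boxes have.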
G+_Joonas Tuomi (Jotu) Posted January 19, 2018
Just an educated guess, but a hashing scheme is likely to be involved. For reference: btrfs.wiki.kernel.org - Deduplication - btrfs Wiki
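For anyone wondering what a hashing scheme could look like in practice, here is a minimal Python sketch that finds identical files by hashing their contents. It is only an illustration of the general idea at the whole-file level; it is not a description of how APFS, ZFS, or btrfs do it internally, and the function names are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB pieces."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def find_duplicates(root):
    """Map each content digest to the files under root that share it.

    Only digests with more than one file are returned, so every entry
    is a group of byte-identical files (barring a hash collision).
    """
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_hash[file_digest(path)].append(path)
    return {d: sorted(paths) for d, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicates` on the folder holding the photo libraries would list each group of identical files. A deduplicating file system would keep one copy of the data per group; this sketch only reports the groups and changes nothing on disk.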
G+_Ben Reese Posted January 19, 2018
Travis Hershberger, I did some research, and Windows deduplication seems to be a Server 2012+ feature that works on NTFS and ReFS. It requires adding the role under File Services, and then it has to be enabled in Server Manager (or PowerShell) for each drive. Just enabling it doesn't do everything, though; it then runs a process on a schedule (the default is 3 days, and it can be run manually). I tested it on an iSCSI drive by copying the same 4GB file into 3 folders, and into a 4th folder I combined that file with another to make a GB file. After running the process manually on the drive, I checked the total folder size. And... no change. It was still using 16GB.
Some afterthoughts: maybe the files I tested with were too big, or too new. Maybe it doesn't work with BitLocker (though I'd expect a warning at some point if so). Maybe it works better on a ReFS drive. Maybe I'm fundamentally misunderstanding how the process works. Regardless, I'd prefer to have this feature on my NAS rather than on Windows Server.
TL;DR: the Windows File Deduplication service didn't work in my test.
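For comparison, here is a small Python sketch of what a chunk-based deduplicator would be expected to save on a test shaped like the one above: the same data stored three times, plus a fourth file that starts with that data and appends something else. It assumes fixed-size chunks and SHA-256 fingerprints, which is a simplification chosen for illustration and not a description of how the Windows service works; the sizes are scaled down to kilobytes and the function name is hypothetical.

```python
import hashlib
import os


def dedup_stats(blobs, chunk_size=4096):
    """Return (logical_bytes, stored_bytes) for a list of byte strings.

    Each blob is cut into fixed-size chunks; a chunk is counted as
    stored only the first time its SHA-256 fingerprint is seen.
    """
    seen = set()
    logical = 0
    stored = 0
    for data in blobs:
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            logical += len(chunk)
            digest = hashlib.sha256(chunk).digest()
            if digest not in seen:
                seen.add(digest)
                stored += len(chunk)
    return logical, stored


# Scaled-down stand-ins for the test files (random data, so chunks differ).
file_a = os.urandom(10 * 4096)   # plays the role of the repeated 4GB file
file_b = os.urandom(2 * 4096)    # plays the role of the "other" file
logical, stored = dedup_stats([file_a, file_a, file_a, file_a + file_b])
print(f"logical: {logical} bytes, stored after dedup: {stored} bytes")
```

In this idealized model only one copy of the repeated data plus the extra tail is stored, so seeing no change at all in the real test points toward a policy or configuration cause (such as the file-age or scheduling questions raised above) rather than the data being undedupable.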