Can a business-grade client computer SSD run 24/7?



  • I am looking at the Kingston KC400 and Micron 1100 as the drive for a pfSense box. (Of course I would not have to worry with a data-center-grade SSD, but that costs more.)

    Both are business-grade SSDs designed for client-side desktop use. They use the same controllers and flash chips as consumer-grade SSDs. Apart from a higher TBW rating and a longer warranty, the only difference from consumer-grade drives that I know of is that the firmware has some data-protection features against power loss.

    I have a KC400 running in a pfSense box without snort/suricata. The read/write load is low compared to a desktop computer. The SMART information still looks good after more than half a year.

    But if a KC400 or 1100 runs with packages that generate a lot of logs (such as snort/suricata), log-analysis software, and web caching (squid and squidGuard), will it still work well 24/7? Is there anything else I should watch out for?



  • When you format, leave half or most of the SSD empty.

    If it is a 128GB SSD, give pfSense 32 or 64 GB. This will spread the wear out over the whole SSD.



  • Use a RAM disk.


  • Netgate Administrator

    Any half-decent current SSD should have no problems at all for years of even heavy logging. See any of the test results available. 100s of TB of writes are required to exhaust the flash write cycles.

    Most of the bad rep and FUD that surrounds SSDs came from early/budget models with poor firmware. That simply doesn't apply to current devices.

    I would not expect the wear levelling to be affected by drive format either. It should distribute writes regardless of partition size.

    But, yeah, if you are worried just move /var to RAM. You will lose logs at power failure though.

    Steve



  • @stephenw10:

    Any half-decent current SSD should have no problems at all for years of even heavy logging. See any of the test results available. 100s of TB of writes are required to exhaust the flash write cycles.

    Most of the bad rep and FUD that surrounds SSDs came from early/budget models with poor firmware. That simply doesn't apply to current devices.

    I would not expect the wear leveling to be affected by drive format either. It should distribute writes regardless of partition size.

    But, yeah, if you are worried just move /var to RAM. You will lose logs at power failure though.

    Steve

    It's not partition size, it's used space. If the partition limits you to using only 50% of the space, then the controller always has 50% free blocks to distribute writes to.

    I agree this is overkill with modern stuff, but if he is worried about endurance/performance degradation, wasting 1/3 or 1/2 of the space is a simple fix for that.

    You replied while I was typing a reply so I'm just pasting my comment in now with more thoughts and experience.

    I don't think a RAM disk will be effective with snort/suricata, squid and squidGuard if the logs are not shipped elsewhere.

    Ideally you would use a RAM disk and ship all that data somewhere else. Flash-based media had problems in earlier versions, specifically with the kind of use you describe. Changes were made so that pfSense alone is now generally very light on SSD or CompactFlash wear.

    I used early SSDs and CompactFlash in ALIX boxes years ago. The embedded install would mount them read-only and use a RAM disk for everything except a few writes, to prevent the type of wear that kills flash media prematurely. That problem has largely been overcome by OSes being able to use TRIM etc., and by controllers in flash media doing better wear leveling. Changes were later made that allow mounting CF media read/write; IIRC this change was pushed into normal updates, though I did it manually for a while before that. I've had a Kingston CompactFlash card running read/write in an ALIX for several years now. Performance sucks, but most of the issues with flash media vs HDD (write amplification, wear leveling and the like) are resolved for pfSense/FreeBSD and/or by better SSD controllers.

    The main issue has to do with individual parts of the SSD getting many, many erase->write cycles. Controllers now are smart enough to write to a new empty part of the disk, so that each update to a log isn't causing an erase/rewrite over and over on the same cells.

    The ability of the controller to spread the wear is limited if all of the disk is in use. Hence, leaving half of the disk with no filesystem/slice/partition etc. allows the controller to spread the wear among all that empty space. Your 50% will get spread around all the physical components of the media, and the controller will never be forced to write to the same place repeatedly.

    https://en.wikipedia.org/wiki/Write_amplification
    https://en.wikipedia.org/wiki/Wear_leveling
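
    To put a number on the "leave part of the disk unpartitioned" idea above, here is a rough sketch. It uses the common definition of over-provisioning as spare space divided by space in use. It ignores whatever spare area the manufacturer already reserves, and it assumes the unpartitioned area has never been written (or was trimmed/secure-erased), since otherwise the controller cannot know it is free. The function name is just for illustration.

```python
def effective_overprovisioning(physical_gb: float, used_gb: float) -> float:
    """Return user-created spare area as a percentage of the space in use.

    Assumption: the factory-reserved spare area is unknown and left out,
    so the real figure is somewhat higher than what this returns.
    """
    if not 0 < used_gb <= physical_gb:
        raise ValueError("used_gb must be positive and no larger than physical_gb")
    return (physical_gb - used_gb) / used_gb * 100.0


# The 128 GB example from earlier in the thread, at a few partition sizes.
for used in (128, 96, 64, 32):
    spare = effective_overprovisioning(128, used)
    print(f"{used:>3} GB of 128 GB partitioned -> {spare:.0f}% extra spare area")
```

    Giving pfSense 64 GB of a 128 GB drive doubles the flash available per GB in use. Whether a modern controller needs that much is exactly what is being debated here.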

    This doesn't sound like a mission-critical application; if it is, you need multiple boxes doing different roles and HA with CARP. So, I'd say the SSD will have plenty of endurance so long as you don't abuse the controller by filling the drive. Plus, if it fails it would probably be a situation where data can still be read, just not written, so clone off the disk to a spare and off you go.

    But you really should be shipping all the logs and data to some other place and viewing/aggregating/analyzing it there.
    As far as squid cache and normal logging go, you should be fine.

    1. If you are seriously worried about the SSD, do it the better way, where logs are shipped to a log server, even a different pfSense box that doesn't run the network. Or simply install a mirror of standard SATA disks, or even a single disk, for storing logs, and only use the SSD for boot.
    2. If it is not that serious, meaning not serious enough to justify a second log server or HDD, you are still unlikely to have issues.
    3. If it IS that serious, but you are forced to do it all in one pfSense box, well, then you are forced to do it, so stop worrying. You are still unlikely to have any issue using only the SSD.

    What I would use as a single SSD for a pfSense box, instead of the Kingston:

    https://www.amazon.com/Intel-SSDSA2BZ100G3-100GB-Internal-Solid/dp/B00830IBVY/ref=sr_1_14?ie=UTF8&qid=1523219783&sr=8-14&keywords=high+endurance+MLC+SSD


  • Netgate Administrator

    Yeah I agree with most of that.

    Using a larger drive inherently gives more locations over which to level the wear. However I'm pretty sure anything recent will do that whether or not the locations are in use. I stand to be corrected though.

    Steve



  • Steve and lftiv,

    I agree with most of your opinions.

    Yes, it is much easier to keep writes "safe" with current SSD controllers. I think the TBW specs of the KC400/Micron 1100 mean they are "safe" for regular pfSense use, snort/suricata logging and squid caching over a 5-year period.

    I may also leave 1/4 to 1/3 of the space unpartitioned when partitioning a new SSD. Adding the free space inside root, /usr and /var, which pfSense can report, it is easy to keep around half of the disk always free.

    One of my concerns is whether the Kingston KC400/Micron 1100 can stay powered on and keep reading/writing 24/7 for over 5 years. The WD Black has a 5-year warranty based on 8/5 use for 5 years, and its spec sheet lists load/unload cycles but no MTBF. (The WD Gold has both load/unload cycle and MTBF specifications.) The KC400 spec has an MTBF but no load/unload cycles. The KC400 has a 1 million hour MTBF, which works out to over 100 years of power-on time (so for this question, the answer may be yes).
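
    As a quick check of that arithmetic, here is a small sketch. It converts the 1 million hour figure into years, and also into a yearly failure probability under a constant-failure-rate (exponential) model. That model is an assumption on my part, not something from the Kingston spec sheet. MTBF is a statistic over a large population of drives, so the second number is probably the more useful one to look at.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def mtbf_to_years(mtbf_hours: float) -> float:
    """Convert an MTBF in hours to years of continuous (24/7) operation."""
    return mtbf_hours / HOURS_PER_YEAR


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Chance that a drive fails within one year of 24/7 operation.

    Assumption: failures follow an exponential distribution, i.e. the
    failure rate is constant over the drive's life.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


mtbf = 1_000_000  # hours, the MTBF figure quoted above
print(f"MTBF in years of power-on time: {mtbf_to_years(mtbf):.1f}")
print(f"Annualized failure rate:        {annualized_failure_rate(mtbf):.2%}")
```

    So "over 100 years" is correct as arithmetic, but under this model it reads as slightly under 1 in 100 drives failing per year, not as one drive lasting a century.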

    By the way, only a few SSD brands address the business-grade client desktop market for replacing traditional hard drives. As far as I know, Intel and Samsung have only consumer-grade and data-center-grade products. Among the traditional Marvell-controller SSD brands, only Micron/Crucial has a business-grade line. So the Kingston KC400 and Micron 1100 are the only ones I found that are sold by highly rated stores on resellerratings.com, excluding 3rd-party sellers.



  • Shortest answer I can give is YES it will last.

    1. You don't have the budget/need for enterprise grade, a mirror, or two pfSense boxes.
    2. Given #1, the risk of failure is one you are willing to take.
    No one can tell you 100% that it will last, even for an enterprise-grade SSD. If you want something approaching 100%, do a mirror or HA with two pfSense boxes, etc. The fact that you are trying to rely on just this one SSD means that the SSD is reliable enough. Otherwise you'd be doing something better and more expensive.

    There is no reason to believe it won't last the 5 years given what you've described as use. It might last 10. If you want more reliability, don't use a single SSD, or a single pfSense box.

    If you want more in an answer, then give actual data: MB written per day, environmental conditions, temperature, power quality, what motherboard it is connected to, what chassis, case, fans, etc.
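
    For the "MB written per day" part, a rough sketch of how the number could be derived and used follows. It assumes the drive exposes a running count of LBAs written in SMART (often attribute 241, but the attribute and its unit vary by vendor, so check the datasheet) and that an LBA is 512 bytes. The 150 TB endurance rating and the sample readings are placeholders for illustration, not the spec of any drive mentioned in this thread.

```python
SECONDS_PER_DAY = 86_400


def mb_written_per_day(lba_start: int, lba_end: int,
                       seconds_elapsed: float, lba_bytes: int = 512) -> float:
    """Average host writes in MB/day between two counter readings.

    Assumption: the counter is in LBAs of `lba_bytes` each; some drives
    report in other units (e.g. GiB or 32 MiB blocks).
    """
    if seconds_elapsed <= 0 or lba_end < lba_start:
        raise ValueError("readings must be in order and some time must have passed")
    bytes_written = (lba_end - lba_start) * lba_bytes
    return bytes_written / 1e6 / (seconds_elapsed / SECONDS_PER_DAY)


def years_to_reach_tbw(rated_tbw: float, mb_per_day: float) -> float:
    """Years until the rated TBW is reached at a steady write rate."""
    if mb_per_day <= 0:
        raise ValueError("mb_per_day must be positive")
    days = rated_tbw * 1e6 / mb_per_day  # 1 TB = 1e6 MB
    return days / 365


# Hypothetical readings taken one week apart.
rate = mb_written_per_day(1_000_000_000, 1_273_437_500, 7 * SECONDS_PER_DAY)
print(f"Average writes: {rate:,.0f} MB/day")
print(f"Years to reach a 150 TB rating (placeholder): {years_to_reach_tbw(150, rate):.1f}")
```

    This only covers rated write endurance. It says nothing about controller or firmware failures, which is where temperature and power quality come in.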

    Business grade vs. consumer vs. data center means nothing. Company X's consumer grade might be better than Company Y's data center grade.
    Ignore marketing terms that don't mean the same thing at each company. Celerons were Pentium chips from a lot where testing revealed defects at greater than 10% or 15%. Cores get disabled, chips are underclocked, etc. But the chips were made in the same factory; only 80% of this group ran at rated speed instead of 90%, so there is still an 80% chance that the 'Celeron' you bought can be clocked to the 'Pentium' speed with no problems. Hell, it might run BETTER than the Pentium your buddy spent more money on. Many times it's cheaper to manufacture everything the same, do testing, then sell the ones that test better for more money under a different label, or sell the ones that do worse for less under a different label with software mods that lower the clock rate or disable cores/components. Ever enable disabled cores on an AMD proc? Why would you sell a proc with cores you never intend to be used?
    How many things can you buy labeled "Heavy Duty"? Are they always better than things you buy that don't say "Heavy Duty"?

    MTBF is not really a useful metric, in my opinion, in general or for this topic. You aren't buying in bulk; you want one drive to operate one device. MTBF may say X, but quality control matters, and you may simply get a bad SSD, one that lowers the MTBF, or you could get one that greatly exceeds it.

    I'd still go for the $55 enterprise-grade Intel, as I think the quality will be better, and in actual abuse tests it lasts longer than the Kingston.
    Also, I've NEVER had an Intel SSD fail, or even seen one fail in person, in 10 years of doing IT support, though it does happen according to Google. I've seen multiple Kingston drives fail in environments that weren't all that demanding. Same for Crucial.



  • I've read a few different times over the years that SSDs last as long as mechanical drives for total data written. It just so happens that by the time you wear out an SSD's flash, your mechanical drive will have died due to being mechanical. It's less a question of which one will last longer and more a question of whether you want a mechanical drive that will die after 5-10 years due to mechanical wear, or an SSD that will die after 5-10 years due to flash wear but is 10x-1000x faster and uses less power.

    Power loss failure modes are quite different. That is one thing to be aware of if you're concerned about data loss.



  • I am looking at some 1U barebones from Supermicro (with fans, not fanless). I think the power supply will not be a problem.

    The Intel 710 uses the same controller as the consumer-grade 320 series, plus HET MLC, which is much better than ordinary MLC.

    By the way, the Intel 710 series looks old and has been withdrawn from some highly rated resellers. The Intel S3520 series is newer (released Q3 2016) and still available, even though it is slow.



  • In the budget SSD market, things have regressed reliability- and endurance-wise, so be careful there, although I expect SSDs marketed for business use are not in that price sector.

    As an example, Kingston's latest 3 models at 60 gig capacity have regressions in random write, rated endurance (the TBW at which the warranty expires) and NAND type (TLC vs MLC). Because of this I bought a batch of old SSD models off eBay, as newer models are inferior, and the market is also flooded with unknown Chinese brands now.



  • There is a fine line between "budget" and "crap".

    "Budget" is buying the cheapest thing that satisfies the requirements.
    "Crap" is buying something that is so cheap that they forwent quality.

    That's not to say that something more expensive has higher quality.

    Techreport did an endurance test of SSDs and was able to write 2.4PiB to a 256 GiB TLC Samsung 840 drive before it suddenly failed, but it did get hit by a power failure shortly before, and that may have contributed. And the 840s were known to have endurance issues. SSDs have gotten much better since, but there are no new endurance tests that I'm aware of.

    Even in the early days of SSDs, when they actually did wear out, the RMA rate for SSDs was half that of mech drives. Back in 2013, there was an article claiming reported failure rates for SSDs were 1/3 of those for mech drives. My biggest concern would be sudden power loss and how a given SSD handles that. Some newer high-density mech drives can lose committed data on power loss.



  • Ok,

    "I am looking at some 1U barebones from Supermicro (with fans, not fanless). I think the power supply will not be a problem.

    The Intel 710 uses the same controller as the consumer-grade 320 series, plus HET MLC, which is much better than ordinary MLC.

    By the way, the Intel 710 series looks old and has been withdrawn from some highly rated resellers. The Intel S3520 series is newer (released Q3 2016) and still available, even though it is slow."

    So you're getting a rack-mount server, but insist on using a single piece of media, even though you can easily mirror with ZFS and have good redundancy and monitoring of drive health.

    That 710 will outlast anything you buy from Kingston. I take the controller matching the 320 as the 320 being good, not the 710 being bad. Unless I'm mistaken, the Kingston you are looking at is standard MLC?

    You never mentioned speed, only reliability. If the 710 isn't fast enough, then that should be mentioned when asking how long it's going to last. The main reason I linked it was that it's CHEAP; the VALUE (quality/cost) is awesome, so buy 4 and get 100 GB, x2 speed and some redundancy. It will probably cost you about what the 1 Kingston will.

    Oh yeah, and remember when I said I'd never seen Intels fail (even in sewer inspection robots that ate lesser media), but I had seen Kingstons fail? I just saw it again.
    So the night before last I'm in a datacenter racking a server for a customer (occasionally I don't own or manage their stuff), and one of the current tenants of the rack had angry flashing lights.
    The reason? I'll give you 3 guesses :)

    It was attached to a P420 controller in an HP ProLiant DL385 Gen 8 server, running XenServer 6.5 with redundant PSUs being fed clean power, and perfect temps: an ideal environment physically.

    But the combination of the HP RAID controller and XenServer 6.5 means the drive's controller doesn't get to shuffle space like it should; TRIM is not passed to the drives, or must be manually invoked from XC.

    And the most ironic part is that I'm fairly sure this would not have happened if they had only allowed use of half the space, but the drive was nearly full and the server not watched closely. After all, with next-to-new 512GB high-quality SSDs, why would they have a problem?
    I pulled the drive and did a secure erase, and the Kingston utilities said it had 90% life left. The drives were installed in 2016, less than two years ago.
    Beating on it with various utilities, the drive is fine as far as I can tell. Performance is normal, and I can write TBs to it.

    Good luck with the build. Supermicro always makes nice stuff, and pfSense is glorious compared to XenServer.

    ![2018-04-10 20_39_04 Kingstons Fail.png](/public/imported_attachments/1/2018-04-10 20_39_04 Kingstons Fail.png)



  • For clarity, this was a 480GB KC300, the same 'Business' class, just one generation older than what you are looking at. It was even billed as 'Enterprise', or at least its endurance and controller features were. Same MTBF, near-identical endurance rating. Different controller, though; LSI doesn't make crap in my experience, so maybe they just sell the good batches to other vendors?

    https://www.kingston.com/datasheets/skc300s3_us.pdf



  • @Harvy66:

    There is a fine line between "budget" and "crap".

    "Budget" is buying the cheapest thing that satisfies the requirements.
    "Crap" is buying something that is so cheap that they forwent quality.

    That's not to say that something more expensive has higher quality.

    Techreport did an endurance test of SSDs and was able to write 2.4PiB to a 256 GiB TLC Samsung 840 drive before it suddenly failed, but it did get hit by a power failure shortly before, and that may have contributed. And the 840s were known to have endurance issues. SSDs have gotten much better since, but there are no new endurance tests that I'm aware of.

    Even in the early days of SSDs, when they actually did wear out, the RMA rate for SSDs was half that of mech drives. Back in 2013, there was an article claiming reported failure rates for SSDs were 1/3 of those for mech drives. My biggest concern would be sudden power loss and how a given SSD handles that. Some newer high-density mech drives can lose committed data on power loss.

    Techreport tested drives (a) that are older generations (the 840 is 2 or 3 generations old now, I believe) and (b) in a different price bracket.

    The £30-£60 price bracket gets very little attention from the tech media, and the products coming out reflect that.

    The quality in terms of random performance and endurance in that price bracket has regressed significantly in the past few years.

    One of the reasons is that the reputable brands have increased the minimum size of the drives they sell, and since the price per gig hasn't gone down, it effectively means they have abandoned that market. Try to find a 30 or 60 gig Samsung 850/950 Pro/EVO, for example.

    The failure rate for some of these new drives is worse than in those "first years". DOA has skyrocketed, and failure within 3 months is extremely high for some drives.



  • I have a $1000 budget, but I need more space for logs and cache, more CPU cores for snort, and fairly low heat and power usage. So I passed on the XG-7100 and on WD or Seagate mechanical enterprise drives.
    (I do know that each snort process can handle around 200Mbps and each core handles 1 snort process, based on Security Onion's default settings.)
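
    That rule of thumb turns into a core count like this. It is only a sketch using the 200Mbps-per-process figure quoted above. Real throughput depends heavily on the rule set and the CPU, and the function name is made up for illustration.

```python
import math


def snort_cores_needed(link_mbps: float, mbps_per_process: float = 200.0) -> int:
    """Estimate snort processes (one per core) needed to inspect a link.

    Assumption: each process handles about `mbps_per_process` and load is
    spread evenly across processes.
    """
    if link_mbps < 0 or mbps_per_process <= 0:
        raise ValueError("link_mbps must be >= 0 and mbps_per_process > 0")
    return max(1, math.ceil(link_mbps / mbps_per_process))


for link in (100, 500, 1000):
    print(f"{link:>4} Mbps link -> {snort_cores_needed(link)} core(s) for snort")
```

    Cores for pfSense itself, squid and log analysis would come on top of that.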

    The KC300 has a SandForce controller, which performs very poorly if the drive has very little space left (10% left?).
    In the 3D NAND age, whether TLC or MLC, consumer-grade SSDs usually use Marvell and SMI controllers, except Samsung, which uses its own. Phison and SandForce are rare. For me, I will pass on an MLC drive with a SandForce or SMI controller.

    Micron has a document, "Over-Provisioning the Micron® 1100 SSD for Data Center Applications", which also discusses leaving some spare space to make data center applications possible on a client SSD.

    By the way, I prefer highly rated online stores with brand-new products. So I am considering the S3520, KC400 and Micron 1100 now. Of course the S3520 is data center grade and more $$ per GB. The 1100 is TLC and fewer $$ per GB.



  • @chrcoluk:

    in a different price bracket.

    Than what? "Budget" does not mean "cheap", just "cheaper" or "suboptimal but still functional". I have a $130 "budget" NIC that I use for pfSense, because it's cheaper than a $500 NIC. I am using two $150 "budget" SSDs for pfSense, because they're cheaper than $500 SLC enterprise SSDs. "Budget" means "cheaper", not "bottom of the barrel".

    Just because someone's budget cannot afford proper parts, that does not make what they're asking for "budget" parts. They just want some knock-off that they can afford and that works better than not working at all. They're just trying to minimize the damage. If someone wants a part that fits within a specific budget, they can post the budget or price range they're looking for. Otherwise you get people saying something like "I need a budget terabit router" and everyone is like "well, here's a motherboard with a 1Gb Realtek NIC". It makes no sense.

    Can a "budget" SSD run 24/7 in a business environment? Sure it can. You just can't get a piece of crap. Can a $60 SSD? Probably not, at least not writing all the time. Can a $40 SSD? An SSD that is about the same price as a case fan shouldn't be used for anything.



  • "Budget" generally refers to the cheapest products that a company makes.

    So, for example, if Kingston makes $30 SSDs, then their $100 SSDs are not budget. But I guess we don't see eye to eye on this.

    I have never before seen someone use the term the way you have, as in "look at these budget parts". What you said I would phrase as "look at these parts I am buying within my budget".

    But yes, the exact point I was making is that $30-60 SSDs from a few years back were capable of what you said they are not capable of, which is 24/7 server-type use. That's the point I made: newer parts in the same price bracket are of lower spec than the older ones.

    The $40 SSD in my pfSense unit has higher endurance than the $300 SSD in my PC. It's only priced at $40 due to the size of the drive; in the year of its release the components were of equal quality to $400 drives made by the same manufacturer.