[Table] IamA splat, editor/moderator/reviewer on and sysadmin at a cancer research organization. AMA!

Verified? (This bot cannot verify AMAs just yet)
Date: 2013-08-13
Link to submission (Has self-text)
Questions Answers
This just got cross posted to /sysadmin ; as a fellow research-field oriented sysadmin it gets worse... I too started in the Quake/HL/CS/TF timeframe, but got my degree in CompSci. Have you ever dealt with mice (the mammal kind; I've got worse stories)? Certs: just got my RHCSA this year. I've got the RHCE scheduled for october, and I'm studying for the CCNA, though I use HP switches.
How do you back up desktops / servers? Backups: Luckily, I don't do desktop support. We have another IT group that does that, I'm completely independent from them and I only have to take care of servers (and my own desktop). The physical servers are backed up to tape with Bacula. Our virtual servers are backed up with Veeam. My own desktop is backed up to my NAS share using SyncToy (yes, I use Windows on my desktop).
How much disk space do you have in one server? One off systems: As in physical servers built by hand? 0. I'm pretty much a Fujitsu shop with a few Dells. I definitely don't have time to be piecing servers together. disk space: only a few TB per server. I think the better answer would be that we have an Isilon X200 cluster that is 140 TB.
one off systems: As in physical servers built by hand? More as in unique software; such as this computer runs the HPLC. I guess in that case I only manage a handful of physical servers and a few VMs that are made for running one special piece of software or analyze data from one piece of scientific equipment. We have many other scientific devices that are attached to PCs that are "community" devices, but I don't have to manage them. and we've got a microscopy group that is separate from me too, with their own machines and devices.
If you are moving to 1gbs are you looking to increase the MTU? I was working on that but had some issues with firewalls for my windows-putty users. First, just to clarify, we're going to 10G from the 1G we have right now. I'm not our main network guy, so I'm not entirely sure but I doubt we'll change the MTU simply because we don't have a remote site so the majority of our traffic is regular internet traffic.
As for our backend network, I do use jumbo frames on a couple VLANs for our storage.
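For a rough sense of what jumbo frames buy on a storage VLAN, here is a back-of-the-envelope sketch. It assumes plain TCP over IPv4 with no header options, untagged frames, and 38 bytes of Ethernet overhead per frame on the wire; the function name is made up for illustration and none of the numbers come from the poster's network.

```python
# Per-frame Ethernet cost on the wire (assumed, untagged frame):
# preamble + SFD (8) + header (14) + FCS (4) + inter-frame gap (12)
ETH_OVERHEAD = 38
# IPv4 header (20) + TCP header (20), no options (assumed)
IP_TCP_HEADERS = 40


def payload_efficiency(mtu: int) -> float:
    """Fraction of wire bytes that carry TCP payload for full-size frames."""
    payload = mtu - IP_TCP_HEADERS
    return payload / (mtu + ETH_OVERHEAD)


for mtu in (1500, 9000):
    print(f"MTU {mtu}: {payload_efficiency(mtu):.1%} payload")
```

Under those assumptions the standard 1500-byte MTU comes out around 94.9% payload and a 9000-byte MTU around 99.1%. The raw bandwidth gain is modest; the larger win is usually moving the same data in roughly one sixth as many packets, which only helps if every device on that VLAN is configured for the larger MTU.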
The most important question for any sysadmin: vi or emacs? Vi improved.
Anand Shimpi and Dustin Sklavos had an interesting podcast on the merits of Haswell on the desktop. In short, Dustin echoes the enthusiast community's frustration with overclocking headroom decreases from Sandy Bridge to Ivy Bridge to Haswell. It seems like IPC has gone up but maximum frequency has gone down so the ratio seems almost 1:1. Then there is the issue of the use of TIM and IHS glue cap that caused some to delid their CPUs (and void their warranties). Question 1: What are your thoughts on the overclocking headroom decreases that we've seen? Question 2: Is Intel doing enough to cater to the enthusiast community? Question 3: How do you feel about the delay in the release of Enthusiast parts by Intel (Sandy Bridge-E & Ivy Bridge-E) versus mainstream parts (Sandy, Ivy, and Haswell)? Intel makes good chips and they do keep pushing technology forward, but they will never do overclockers any favors. They will always be doing whatever they can to make money. AMD will also do the same thing. Intel seems to think enthusiast solely means "deep pockets". At the same time, there always seems to be a lot of "the sky is falling" reporting done by many tech journalists. Intel hasn't completely forgotten about overclockers and I don't think they ever will completely let that group disappear. And really, what incentive does Intel have to completely lock out overclockers? Sure, deny us our warranty, we'll go ahead and buy another chip and give you more money. How could you deny that as a company? As for overclocking headroom decreases, one can only hope that means we've got a whole new architecture coming out soon, something like the transition from Pentium 4 to Core.
Do you have a home lab setup to learn/test on? If so, what does it consist of? At home I've got a 1U Dell PowerEdge sitting in a closet which is my main server. I run off it which was supposed to be my way of giving back to the community, by running a Linux torrent site. Other than that I've got two HTPCs running Debian, a desktop Windows machine for gaming/reviewing hardware, and a file server with 8 TB running Debian and KVM with a few Debian VMs.
Do you still have that site going? I tried your link but it didn't work. Looks like I let the SSL cert expire. I'll fix that tomorrow. It works on my end but I think I want to recode a few things and possibly get it to work with other trackers. Right now the torrents will only work with my local tracker.
Need to monitor that ;D. Yeah it's one of those things where I seem to be the only one visiting the site, so why stress about it. I also set up owncloud, but again, i'm the only one that uses it. :(
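A small check along the lines the commenter suggests could look like the sketch below. It uses only the Python standard library; the function names and the idea of running it from cron are assumptions for illustration, not something the poster describes using.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter time (negative if expired)."""
    # Parses the OpenSSL text form, e.g. "Jan  5 09:34:43 2018 GMT"
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400


def fetch_not_after(host: str, port: int = 443) -> str:
    """Return the notAfter field of the certificate a live server presents.

    With the default context the handshake raises an error if the
    certificate has already expired, which is itself worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `days_until_expiry(fetch_not_after("example.org"))` (placeholder hostname) and send a warning when the result drops below some threshold such as 14 days.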
Do you get to keep the hardware you review? - Do you prefer the black theme or the white theme? Most of the time, yes.
Black. I don't mind the white theme that much tho. edit: he's asking about the forum default skin at
What is your #1 piece of advice for any linux sysadmin? That's a tough one. Do you mean someone looking to become a sysadmin or one that is already a sysadmin?
I guess I didn't specify that did I? I ask the question because I've been doing mostly Windows sysadmin duties for about 2 years and some linux admin stuff. I'm falling in love with Linux and I would love to have a job dedicated to just *nix. What advice/suggestions would you give someone that is wanting to make the transition? I think what really got me the best knowledge was forcing myself to use a "less polished" distro as my main rig for a few years. Once you are forced to learn, you'll learn quickly. Picking up an RHCSA book will help too even if you don't plan on taking the exam. Go through it and do the exercises. Install a distro, set it up, then format and do it all over. You can use VirtualBox for the same result without killing your main rig.
Do you still use FreeBSD? If so, what exactly do you use it for now? No, but I wish I did. I stopped using it because the GPU support in Linux was better on my desktop, and now I work mostly with CentOS, and it would be a lot of work to change 100ish servers over to FreeBSD.
What did you use to train yourself in everything? Just break and fix? Pretty much just the experience of using it daily on my desktop for years. Running gentoo and Slackware really gets you used to doing things for yourself.
Configuration management of choice for those 18 servers? I'm just a jack of all trades sysadmin with a strong focus on problem solving. Are you trying to cure cancer with those 18 nodes or mining bitcoins? I started playing around with puppet but haven't really gotten the hang of it. Right now the cluster is running ROCKS with Grid Engine, and I just use the rocks commands to provision/wipe nodes.
What's the hardest part about getting started with puppet? I think it's mostly just finding enough time to sit down and immerse myself in it.
700+ centos nodes across a few clusters here and I'm loving ansible. Nice. I've heard that ROCKS becomes a bear at scale, but for now it's pretty simple and quick. My plan is to keep adding another 18 nodes (one full blade cluster) every year, as long as I can get funding, so I'm keeping my eyes open for other solutions for provisioning. Bright Cluster Manager is another one I have on my radar.
Computer didn't work for 5 months (it started then after i downloaded skyrim from steam it shut off, then finally worked last month). Put my new graphics card in, then problems ensued. Here: Link to 1st step i'd do is remove all nonessential parts from the computer. Leave the cpu and 1 stick of ram. Pull out the graphics card, don't connect any hard drives or cd drives. On the back, connect the monitor to the on board video card and connect the keyboard. Does it power on? Do you get any error messages other than it saying there is no OS? Then power down and connect things one by one until you figure out what part is causing the problem. If you think it's the drivers, you can boot into safe mode (i hope windows 8 still has that, press f8 while booting), then run Driver Sweeper, to remove the graphics drivers. I haven't tried this on windows 8 so i'm not sure if it will run or not. I don't think you need to do a full format and reinstall.
I'll try this tomorrow after work for sure. Do you reddit enough that I could contact you for more advice or help if I run into anything else? (I did contact the nvidia team for help, they just told me to delete old drivers without any other help than those words). I don't blame you if you don't want to say you are able to help me with this situation. Humans be humans. Was there a specific reason to go into a cancer research lab? Or was it just a job that came around? No I don't go into photoshopbattles. I pretty much just do what I need for websites and that's it.
How do you like your baked potatoes? (please get into specific detail). It just happened to be the job I found but I love the environment. Much different than a corporate job.
I'm not a fan of baked potatoes but I do love curly fries if that counts for something.
You should really join us in the BAPC IRC channel. I do hang out in the unofficial irc channel quite a bit. I'll try to drop by.
Do you do any sort of automation for firmware updates? Firmware automation? Nope, and I don't think I'd ever want such a thing. I've been looking at puppet as a way to automatically update software though.
I saw below you guys have some Dell servers, what models and do you use their Lifecycle Controller? We have a couple r610 servers and an equallogic storage box. I haven't heard of this life cycle controller.
What are the specs of your personal rig? Intel i7 3770k @ 4ghz.
Zalman CNPS9900LED cooler.
Patriot ddr3 2x2gb @ 800mhz cas7 (rated for 1200mhz cas9 but I can't boot at that speed anymore for some reason)
MSI Z77A-G41.
ATI Radeon HD 6870.
OCZ Revodrive X2.
How come you have a 3770k but only 4GB of RAM and a 6870? Seems a little overpowered in the CPU category. For benchmarking, mainly. The 3770k was our standard platform for reviews when I bought it. The rest is leftovers from various reviews. We don't get paid, so basically we work for hardware when we write reviews, more or less.
Wait when you review hardware you get stuff? Yes, hardware vendors provide review samples.
Have you ever had an OEM send you equipment different from the consumer version? (Say a factory overclocked version) and claiming it was the standard. Nope. Even if they did, we'd certainly review it as the hardware is, not as they intended it to be.
What's the worst PC loadout you've ever seen? PC load letter? What the f does that mean?
[email protected] JK, doesn't work well on a cluster unfortunately. Unless you have any perls of wisdom on how to make it work on a cluster? Well, it would work just as it does on any other group of computers. I'd have to run one client on each computer and they'd all check back to get their own workloads, so it would really take out the "cluster" usage and turn them just into regular blade servers.
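The pull model described in the answer, where every machine runs its own client and fetches its own work, can be sketched as follows. This is a toy illustration only: threads stand in for separate computers, an in-process queue stands in for the project's work server, squaring a number stands in for the real computation, and `run_clients` is a made-up name. It is not the actual client protocol of any distributed-computing project.

```python
import queue
import threading


def run_clients(work_units, n_clients=3):
    """Each client pulls its own work units until none are left."""
    todo = queue.Queue()
    for unit in work_units:
        todo.put(unit)

    results = {}
    lock = threading.Lock()

    def client():
        while True:
            try:
                unit = todo.get_nowait()  # "check back" for the next workload
            except queue.Empty:
                return  # nothing left, this client goes idle
            answer = unit * unit  # stand-in for the real computation
            with lock:
                results[unit] = answer

    threads = [threading.Thread(target=client) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_clients(range(6)))
```

Nothing here coordinates the clients with each other, which is the point of the answer: the nodes behave like independent servers that happen to sit in the same chassis, and the cluster's own scheduler plays no part.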
How old are you? Young 30s.
Have you gone to college and completed a bachelor's degree, if not, do you regret it? Yes, BS in Mechanical Engineering.
How did you prove yourself to be worthy of that initial Jr. Sys. Admin job? I listed everything I could think of that I've done that was computer/sysadmin related. I had administered several web servers over the years, and experimented with many different distributions as my daily driver on my main desktop, so I was very comfortable on the command line and with day to day tasks. I was asked a few 'test' questions in the interview but I think they were more to gauge exactly what I did and didn't have experience with, not so much to make or break me.
Lastly, congrats on doing what you love for a living. Cheers to your future. And thanks. i definitely wake up in the morning with a different attitude than i used to, and that makes a big difference.
Configuration Management / Vagrant / Clouds. I have started playing with configuration management, but haven't gotten anything in production yet. I only provision new VMs every once in a while, and once the compute nodes are up they are pretty stable.
What is your scripting language of choice? I use straight up bash for most things, and python for some. I'm trying to learn more python.
How do you feel about some distros moving away from init.d and going to systemd? I like init.d because it's what I know. Systemd is just a different way of doing things, I'm sure I'll like it once I learn it.
As a OCF Member I have to ask, What is the most extreme cooling you have dealt with?(LN2, Phase Change, Water, D-Ice, etc.) LN2, at the benching party in philly last year. We definitely need to get one of those on schedule again. Also, my work has LN2 and D-ice sitting around but I haven't asked if it's ok for me to play with those yet. One day, i'll ask, and it will be awesome if they say yes. fingers crossed.
So, can I have some of your left over gear? Joking, heh heh... Seriously though, got any gear that's collecting dust? Mostly by the time we're ready to part with gear, it's not worth much and is terribly outdated. Or, it's been burned up by pushing too many volts.
What do you do with the old gear? Do you scrap up a functional computer and donate it to a charity, or just proper e-waste recycling? If it's not on my computer or benching station, it's in my closet. And my wife doesn't like the amount of computer stuff in my closet, so I'm sure I'll start looking for some way to recycle stuff soon.
Where does a young grasshopper start to learn all of these materials, wise one? Well, you could get yourself a RHCSA prep book (linked to the one I have and found useful) and go through all of the exercises. The way I learned was basically to set up my own servers, either physical or virtual, at home, and run them. I think FreeBSD, Gentoo, and Slackware were the most beneficial to me in that they don't really make choices for you, so you have to configure things for yourself which forces you to read the documentation and learn. They all have excellent documentation, btw. If you want to go a step further, Linux From Scratch will really teach you about the operating system from the ground up.
From there, come up with little projects for yourself. Like making a home NAS, setup NFS and Samba shares, install XBMC on a HTPC and hook it up to your tv to stream movies and music. Setup a webserver and owncloud. Stuff like that.
Sorry I'm late but... how old were you when you first started tinkering with Linux and such? I'd like to be a sysadmin or similar when I finish school so I figured you were the right person to ask. I was 19 when I first made that Half-Life/Counter-Strike server. I didn't even know what ssh was and it took a good amount of explaining for me to finally understand. The FreeBSD documentation is amazing and will walk you through just about everything step by step. To get NAT configured I had to use another how-to but setting up that server taught me a ton.
Are you an Nvidia or an AMD guy? It's changed several times over the years. I used to be solely Nvidia because of Linux, but AMD has been stepping up their game and getting their drivers usable, so I currently run all AMD.
How much of a PITA is it for you to be HIPAA compliant? It's not really that tough. Luckily there's only a couple projects going on right now that have special needs above and beyond regular security needs.
What do you use for storage? We have a few Jetstor SANs, a couple Promise RAID boxes, and an Equallogic box as our VMWare backend. But our main mass storage is Isilon X200.
Whoops my bad, meant 1.18 not 1.8 it'd be gone if it was 1.8. sorry. I am using a hyper 212 EVO in the standard push configuration. Well 1.18 is too low for 4.4ghz.
Only 4gigs of ram in your rig ? Yeah...I've got 16 in my work PC for running VMs, and 16 in my VM host at home too. I'll probably buy more soon.
Oh ok, what V would I go to? I was able to initially get 4.4 with 1.18 and 0 whea errors, what V would you recommend? This is my first oc btw. Bump it up one step at a time until you are stable. Be methodical about it. You can check out what values other people are getting on
Ok Ill do that, thanks man, at what V if the errors dont go away should I stop advancing them? Most likely you will want to stay around 1.6v. I'm not very familiar with that chip specifically so I'd check hwbot to see what other people have posted and go by that. Obviously remember that not all chips are the same, so you can't expect to get exactly what other people get.
1.6, that seems a bit high for my 212 EVO, a few days ago I did have it at 1.18 without any WHEA 20 errors. That's why I'm saying take it slow, one step at a time.
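The "one step at a time" approach can be written down as a simple loop. This is only a sketch of the procedure: `is_stable` is a hypothetical stand-in for whatever stress test and error check you actually run at each setting, and the start value, step size, and ceiling in the example are made-up numbers, not recommendations for any particular chip or cooler.

```python
def find_stable_voltage(is_stable, start, step, ceiling):
    """Raise the voltage one step at a time.

    Returns the first value that passes is_stable, or None if the
    self-imposed ceiling is reached first.
    """
    steps = 0
    while True:
        vcore = round(start + steps * step, 3)  # rounding avoids float drift
        if vcore > ceiling:
            return None  # stop here and lower the clock instead
        if is_stable(vcore):
            return vcore
        steps += 1


# Pretend this chip becomes stable at 1.22 V (invented for the example).
print(find_stable_voltage(lambda v: v >= 1.22, start=1.18, step=0.01, ceiling=1.30))
```

Setting the ceiling before you start is the useful part: it turns "when should I stop?" into a decision made up front from what others report for the same chip and what your cooling can handle, rather than one made mid-run.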
What do you think of this quote by Richard J. Schwartz? "The impact of nanotechnology is expected to exceed the impact the electronics revolution has had on our lives." Sounds good to me. I can't wait to see what comes next.
Actually nodes, or are some of them VMs? Physical blade servers as nodes. with 144 GB ram each.
Zfsonlinux in use? No I haven't used zfs at all.
Hey... You're pretty cool. Thanks. You're not too bad yourself.
The answer should be ''i wish i could say the same to you'' I'm not like that.
Just how big is your HPC? Only 18 nodes :/ but it's more what I do with it...
How'd you get your nickname. Back when I played CS in the dorm freshman year of college, I used to get killed all the time. So I started calling myself "jack splat", as a play on the nursery rhyme (jack sprat), then shortened it to 'splat' on most of the websites I signed up for.
Describe a SHTF moment at your work place. I can imagine it must be highly stressful being the sole responsible person to keep all that gear running. I definitely have a few and luckily they aren't that bad. One of my first few months, I decided to connect this wireless ap to the network to test it out one morning. As I was being awesome managing the cable to make it look clean, one of the security guards came into the server room and said they had no internet. I looked at our switches and they were all lit up solid. By hooking up the ap, which had spanning tree turned on, I took down the network of the entire building.
Ouch...that's definitely a SHTF moment. glad you came out unscathed. Luckily, all I had to do was unplug it and everything went back to normal. I then set up a spare switch at my desk and played with it before figuring out that STP needed to be disabled on the AP. Now it's been running for over a year without incident.
Would you rather fight 100 duck sized horses or 1 horse sized duck? I'd go for the horse sized duck. Seems like more of a challenge.
U mad? Nah, I'm feeling pretty good today.
