Linux diagnostic software

One of the servers under my supervision started having problems a few weeks ago. It has had several kernel Oopses (roughly the equivalent of Windows’ BSOD, I think), but sometimes it just crashed hard, with no message whatsoever in the log files. This had me baffled for a while. At first I thought Fedora needed to be upgraded to the latest version, but it became clear that even with the latest updates installed, the server was still having problems.

Somebody pointed out that memory should be the prime suspect in this case. So I ran memtest86, and sure enough, it found hundreds of bad bits in the first 512MB.
Unfortunately, it is NOT possible to print out the error messages from memtest86, which will be a problem when I try to return the memory module to the supplier. So I started to look around.
(note to self: recheck that these errors are not caused by wrong memory timing in BIOS)

Thankfully there’s memtester. I’ll probably give it a try tomorrow.
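
Since the whole point is to have something to show the supplier, here is a minimal Python sketch of how the output of a command-line tester such as memtester could be saved to a log file while still being shown on screen. The function name run_and_log and the log file name are made up for illustration, and the sketch only assumes that the tester prints its results to standard output or standard error.

```python
import subprocess
import sys
from datetime import datetime


def run_and_log(command, log_path):
    """Run a command, echo its output, append it to log_path, return the exit code.

    `command` is a list such as ["memtester", "512M", "1"] (hypothetical usage).
    """
    with open(log_path, "a", encoding="utf-8", errors="replace") as log:
        # Header line so that separate runs can be told apart in the log.
        log.write(f"# {datetime.now().isoformat()} {' '.join(command)}\n")
        log.flush()
        proc = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error messages in the same log
            text=True,
            errors="replace",
        )
        for line in proc.stdout:
            sys.stdout.write(line)  # still visible on the console
            log.write(line)         # and saved for the supplier
        return proc.wait()
```

Assuming memtester is installed and the script is run as root (so that the tool can lock the memory it tests), a call such as run_and_log(["memtester", "512M", "1"], "memtester.log") would test 512MB for one pass and leave a copy of everything it printed in memtester.log. A nonzero return value generally means something went wrong.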

Along the way, I found several other relevant links:

[ An excellent guide on troubleshooting hardware problems on Linux ]
[ List of many diagnostic tools on Linux ]
[ Comprehensive list of tools and procedures for testing hardware on Linux ]

Hope you’ll find them useful.

19 thoughts on “Linux diagnostic software”

  1. 1. An error reported by memtest does not prove that the DIMM is the culprit. A memory error can come from the DIMM, the socket, the memory controller, the processor running the test, the memory bus, or wrong timing settings. So blaming the DIMM just because it fails memtest is an oversimplification. You have to cross-check by moving the DIMM to another working system or by using a hardware memory tester. This is why I love big systems, with multiple memory banks, multiple CPUs, multiple memory controllers, an ECC-protected bus, and fixed memory settings driven by the SPD.
    2. If you really want comprehensive console diagnostics, buy a system that ships with good console diagnostics, such as the x86 servers from IBM, Compaq or Sun. The console output can be redirected to another computer over telnet or a serial console, where you can capture it.
    3. Try to avoid a single-DIMM configuration; always spread the memory across multiple DIMMs and multiple banks. This also gives you better interleaving.

  2. Thanks for the comprehensive comments.

    1. This is absolutely correct and needs to be kept in mind by anyone using memtest86: even though in the majority of cases the fault will be in the DIMM itself, there is still a possibility that the error actually lies somewhere else.

  3. My Linux Experience
    As a long-time Windows user, it took quite a lot for me to finally decide to make the leap and give Linux a test drive. I thought that I had everything I needed on my Windows systems, but at the same time I was slightly curious as to what Linux had to offer. Many of my more technologically gifted friends spoke of the wonderful benefits of running an open source server, but I was still hesitant. I wanted to set up my own web server for basic home blogging and personal use. I was originally going to go with Windows, but I had an old desktop lying around and really did not want to make any additional investment. After I talked it over with my friends, they eventually convinced me that running Ubuntu and Apache would be not only the easiest but by far the most cost-effective way to get my server up and running.
    I went off, downloaded Ubuntu and burned it to a CD. I was debating between Ubuntu, Kubuntu, Red Hat, and Debian, but based on my friends’ advice I decided that Ubuntu would be the best way to go. I inserted the CD and began the installation. Much to my surprise, the initial format and setup was extremely easy. I thought that there would be problems with the recognition and installation of all my hardware components, but Ubuntu was able to install everything I needed, and actually much more easily than my last Windows install. After creating my superuser account and getting the operating system fully set up, I was ready to start looking around. The first thing that I discovered was how seamlessly the Gnome GUI and the Bash command line worked together. While I was familiar with DOS and used it frequently, it was a mere infant when compared with the functionality of Bash. While the commands were definitely different, I was able to figure out most of the simple ones very quickly. There are, however, many more to learn, and I actually find myself looking forward to really digging in and learning my way around.
    Once I was done playing, I wanted to focus on getting my web server up and running. Luckily Apache was easily found and installed. After that I went through the Gnome GUI and did the basic configuration for the web server. I created a folder to house my page and copied in my index and all the other files. With the server up and running, I entered my system’s IP address in a browser and the web page came up with no problems. I wish I had listened to my friends months ago, as this was way easier than I ever imagined it would be.

  4. Interesting post, especially the first comment. Evokes!

    I want to try memtest86 as a preventive check.

  5. Oh yeah, I had a similar problem for a while too. At first I thought that upgrading Fedora would help, but I was wrong. The problem was in the memory; my programmer had written a memory checker, so we found the problem quickly. By the way, thanks for the useful links, I will definitely bookmark them all!

  6. Firefox is getting slower and slower, and it’s affecting my work as a content writer, so I’m using Google Chrome instead.
