Site Vulnerability and Your Raw Access Log!
This is old news that I didn't know about until I looked in my Raw Access Log today.
There is a vulnerable script in many WordPress themes, and it affects other blogging platforms as well: timthumb.php and more.
A short description:
TimThumb is a PHP script that crops and resizes images. It has a vulnerability that may allow unauthorized access to a blog.
Now, just because the script is on your site does not mean you have a malicious script there. The script itself contains no malicious code. It is the way it works that could be exploited to install programs on your site. More on the script and some discussions about being hacked in links at the bottom. There's also a link to a script you can use to check if you have the vulnerable version.
How do I know someone tried to find this timthumb.php file?
Check your Raw Access Logs often for signs of attempted breaches.
What you don't want to find in this case: GET lines that include filenames such as timthumb.php, thumb.php, check.php or uploadify.php.
Again: check your Raw Access Log often! It contains a lot of information you might not understand, but it may tip you off to something going on that you would not approve of. Change the log file settings in your cPanel to save daily. Download the logs when you're doing housekeeping and look inside them.
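Looking for those file names by hand gets tedious, so here is a minimal Python sketch of the same check. It assumes the log uses the usual quoted "GET path HTTP/x.x" request format shown later in this post. The names find_probes, scan_file and SUSPICIOUS_NAMES are made up for this example, and the list only holds the four file names mentioned above.

```python
import re

# File names mentioned in this post; extend the tuple as needed.
SUSPICIOUS_NAMES = ("timthumb.php", "thumb.php", "check.php", "uploadify.php")

# Matches the request part of a log line, e.g. "GET /path/file.php HTTP/1.1"
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def find_probes(lines):
    """Return (line_number, path) pairs for GET requests naming a suspicious file."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = REQUEST_RE.search(line)
        if match is None:
            continue
        path = match.group(1)
        # Compare only the file name, ignoring directories and any query string.
        filename = path.lower().split("?", 1)[0].rsplit("/", 1)[-1]
        if filename in SUSPICIOUS_NAMES:
            hits.append((number, path))
    return hits


def scan_file(log_path):
    """Scan a downloaded log file; log_path is whatever you named your local copy."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        return find_probes(handle)
```

Calling scan_file on a downloaded log returns the line numbers and requested paths worth a closer look. A hit only means someone asked for the file. The status code on that line tells you whether the file was actually found.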
If you have no idea of what all that content is, do a search on Google and read up to know more. It may save you a whole bunch of trouble and money if you learn just a little bit about the topic now and then. The log files are your friends even if you don’t understand the content.
Take a deep breath and relax. They're plain text files to begin with. There is no formatting or "binary" stuff in those files. Let's take a look, I'll explain.
Each line represents one single access from the outside world. It is written by the web server that hosts your web site. (Now, there might be differences depending on what the settings are for the web server at your host.)
Each line can be divided into 8 sections.
Here's a line from my Raw Access Log that alerted me to this timthumb.php stuff:
nnn.nn.nnn.nn - - [24/Mar/2012:15:25:45 -0500] "GET /wp-content/plugins/verve-meta-boxes/tools/timthumb.php HTTP/1.1" 404 10044 "-" "Mozila/4.0 (compatible; MSIE 6.0; Windows NT 5.1;)"
Let's split it up into its sections like this:
- nnn.nn.nnn.nn
- - -
- [24/Mar/2012:15:25:45 -0500]
- "GET /wp-content/plugins/verve-meta-boxes/tools/timthumb.php HTTP/1.1"
- 404
- 10044
- "-"
- "Mozila/4.0 (compatible; MSIE 6.0; Windows NT 5.1;)"
1.) First, there's the IP address: nnn.nn.nnn.nn. In this case it was a dial-up address; I know because I looked it up with some network tools. A dial-up address is reassigned pretty fast, so I can't easily find the origin. Even if I did, that computer might itself have been hijacked. But the user records are stored at the ISP he or she used to access the Internet, so they've been notified, with the related part of the Raw Access Log included. I don't know if that led to anything in Brazil, but I hope it will burn the creep's butt a lot. People can face jail time for this kind of thing.
2.) The two dashes here indicate missing data. The first is the identity of the user running the client (the one that requests the document) according to the ident protocol (RFC 1413). The other is the user name supplied through HTTP authentication, if the request used any. They are both typically unreliable.
3.) The date stamp. This is the date and time the line was added to the log. It's the local time where the hosting server lives, which may not be your time. It also shows the offset from GMT (-0500 here).
4.) Between the first set of double quotes is the request that the web server running on your hosting site was asked to serve. The word GET is the request method, and the rest is the path to the file that the "visitor" wants, plus the version of the protocol, here HTTP/1.1. Now, this is interesting. If there's a whole bunch of lines requesting something that does not exist on your website, there's definitely something fishy going on. I did a Google search on timthumb.php and found the links below (among others).
5.) Then there's a number. It's a status code; "404" simply translates to a "Page not found" error from the web server in this case. As long as these requests get a 404, it's cool. It tells me that what they wanted to exploit can't be found on the site. A "200" means "OK".
6.) This is the size of the content sent to the client (“visitor”). If nothing was sent it’s a dash. In this log entry it shows the size of the 404 (Page Not Found) document served.
7.) The "-" here indicates that this "visitor" wasn't sent to my site through a link. It can be a direct access or a script that picks the address from a list of sites to examine for exploits. If this had been a visitor who clicked through to my site from a blog post somewhere else, it could have looked something like this: "http://someblog.somedomain.com/". Then I would have known that the "visitor" had been referred. The site providing the link is called the referer, and that is also the name of this section.
8.) The last chunk of text is the name and type of the browser and system that made the request (the user agent). Or at least it's what the web server was told. In this case it's not to be trusted.
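The eight sections above can be pulled apart with a few lines of Python. This is only a sketch: it assumes your host writes lines in the same layout as my example, and the pattern LOG_LINE_RE, the function parse_line and the field names are my own labels. Note that it splits the two dashes of section 2 into separate "ident" and "user" fields.

```python
import re
from datetime import datetime

# One named group per section; section 2 (the two dashes) becomes ident + user.
LOG_LINE_RE = re.compile(
    r'^(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Split one access log line into a dict, or return None if it doesn't fit."""
    match = LOG_LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # The date stamp includes the server's offset to GMT, e.g. -0500.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A dash in the size field means nothing was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry
```

Feeding it my example line gives a dict where entry["status"] is 404 and entry["referer"] is "-". Lines in a different layout simply return None, so check what your own host writes before relying on it.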
Try this for fun:
Type your own web site address in the address field in your browser. Then continue to type whatever you like, without spaces. It doesn’t matter what you write. Use this string as an example if you like:
Paste it right after the slash at the end of your domain. Then hit return and re-open the Raw Access Log. What you typed will be right there in the log, on one of the last lines, within the first set of quotes. See point number 4 up there.
All your own accesses will also be in this file as well as all the robots and index gathering services around the world. You’ll see all the legit accesses there too. The Raw Access Log file is a tool. Learn what it contains and save your day. Backup your site often.
Important on Passwords
Change your password at least once a month and don't use the name of your dog or pet alligator. Don't use extended or altered dictionary words like spongedwater or batmancoveredmesh or the like either. Never, ever use names. A really good password consists of at least 12 characters that don't make any sense. It should look something like this: hFgy5d-95s-hDye4, and it should really extend to somewhere around 20 characters. Do this for your cPanel (or whatever your hosting's administration control application is), your billing account, and the blog admin login page.
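If you'd rather not invent nonsense strings yourself, a few lines of Python can do it. This is a minimal sketch using Python's standard secrets module; the function name make_password, the alphabet (letters, digits and the dash) and the default length of 20 are my own choices for illustration.

```python
import secrets
import string

# Letters, digits and a dash, similar in flavor to the example password above.
ALPHABET = string.ascii_letters + string.digits + "-"


def make_password(length=20):
    """Return a random password of the given length (at least 12 characters)."""
    if length < 12:
        raise ValueError("use at least 12 characters")
    # secrets.choice draws from a source meant for security-sensitive randomness.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Each call to make_password() gives a different 20-character string. Store it in a password manager, since nobody remembers these.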
When you create a password, it's not stored as plain text in your home directory. It is converted through a seriously complex formula into a scrambled word. It can't be reversed, so if someone manages to get hold of that file, they only get a scrambled password.
You may believe that, since it's scrambled and can't be reversed, you're safe. But it is possible to find out your password through trial and error. Hackers use dictionary files, lists of names, lists of places, and lists of all sorts of words people tend to use as passwords. They run a program that picks a word from their gigantic lists, converts it the same way your password was converted when you created it, and tests the result against what they managed to get from your site. It repeats this until it gets a match.
Let's assume your password was "Anaconda". They would have a match in milliseconds, as it starts with the letter "A" and is a word found in any dictionary. When they have a match, they log in to your site and use it in any sick way they like.
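To show how little work that is, here is a toy Python sketch of the trial-and-error idea. Loud assumption: I use plain SHA-256 as a stand-in for the "scrambling" formula. Real systems normally use salted, deliberately slow schemes, so treat this as an illustration of the principle only. The function names and the tiny word list are made up.

```python
import hashlib


def scramble(password):
    """Stand-in for the one-way conversion (plain SHA-256, for illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_attack(stolen_hash, wordlist):
    """Try each word, plus simple case variants, until the scrambled form matches."""
    for word in wordlist:
        for candidate in (word, word.lower(), word.capitalize()):
            if scramble(candidate) == stolen_hash:
                return candidate
    return None


# A tiny stand-in for the attackers' gigantic lists.
words = ["aardvark", "alligator", "anaconda", "zebra"]
found = dictionary_attack(scramble("Anaconda"), words)
missed = dictionary_attack(scramble("hFgy5d-95s-hDye4"), words)
```

Here found comes back as "Anaconda" because the word is in the list, while missed is None: a random string is simply not in anybody's dictionary.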
And, of course, there is a script that checks for the TimThumb vulnerability on your site. And yes, it's safe. It's under the supervision of WordPress.