Anyone got any means of CPU/resource usage monitoring?


NeilFawcett
3-4-04, 10:47 AM
The best thing I've found so far is firing off a Unix ps command. This only shows your current processes though, e.g. http://www.loomforum.com/unix.php

A historical log like this would be ideal, i.e. so you can see what your usage has been like over the past hour etc.

Anyone got any tips?

Pig
3-4-04, 11:31 AM
I do not know of anything existing (not that that means much), but if that script provides the information you need, why not just set it up to parse the data and put it into a database along with a timestamp? Then just set cron to run the script once an hour, or whatever.
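
A minimal sketch of that idea, assuming a flat log file rather than a database. The helper name `stamp_lines`, the log path, and the crontab entry are all made up for illustration, and each snapshot still only records what is running at that moment.

```shell
# Prefix every line read from stdin with a timestamp, so snapshots
# taken at different times can be told apart in one log file.
# (stamp_lines is a hypothetical helper name, not a standard tool.)
stamp_lines() {
    stamp=$1
    while IFS= read -r line; do
        printf '%s %s\n' "$stamp" "$line"
    done
}

# Example use (not run here): append one stamped snapshot to a log.
#   ps aux | stamp_lines "$(date '+%Y-%m-%d %H:%M:%S')" >> "$HOME/ps_history.log"
#
# Example crontab entry to do that once an hour, assuming the function
# and the line above are saved as the hypothetical script
# $HOME/bin/ps_snapshot.sh:
#   0 * * * * $HOME/bin/ps_snapshot.sh
```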

NeilFawcett
3-4-04, 11:37 AM
Originally posted by Pig
I do not know of anything existing (not that that means much), but if that script provides the information you need, why not just set it up to parse the data and put it into a database along with a timestamp? Then just set cron to run the script once an hour, or whatever.

That's no good I'm afraid... When you run the ps or top command it's a snapshot of what is happening at that moment, i.e. one of your scripts may have been run by a visitor to your website two minutes ago, used up 99% CPU, and only just finished, and the snapshot would never show it :)

What you ideally need is something like the ps or top commands I demonstrate above, but which shows EVERY run, i.e. all historical ones, and not just the currently active ones...

BerksWebGuy
3-4-04, 11:50 AM
I can't really imagine having every second of the day logged somewhere with the CPU usage.

If you really wanna get complicated, create a script that runs the ps command every 1-2 seconds and writes the output into a flat file (or db). Then every minute keep the top 5 values, every hour take all those values and keep the top 10, and so on and so on. Or you could average them out.

As long as you are logging the ps command, you can do a lot with it.
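
The "keep the top 5 values" step could look roughly like this. It is only a sketch: it assumes %CPU is the third column, as in `ps aux` output, and `top_cpu` is a made-up helper name.

```shell
# Keep only the N busiest rows from ps-style output read on stdin.
# Assumes %CPU is the third whitespace-separated column, as in
# "ps aux" output, and that the header line has been removed.
# (top_cpu is a hypothetical helper name.)
top_cpu() {
    sort -k3,3nr | head -n "$1"
}

# Example use (not run here): top 5 rows of the current snapshot.
#   ps aux | tail -n +2 | top_cpu 5
```

The same filter could be run again over a minute's or an hour's worth of saved rows to do the roll-up described above.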

NeilFawcett
3-4-04, 12:31 PM
Originally posted by BerksWebGuy
I can't really imagine having every second of the day logged somewhere with the CPU usage.

If you really wanna get complicated, create a script that runs the ps command every 1-2 seconds and writes the output into a flat file (or db). Then every minute keep the top 5 values, every hour take all those values and keep the top 10, and so on and so on. Or you could average them out.

As long as you are logging the ps command, you can do a lot with it.

Just like every file access is logged (i.e. we can look in our main root directory for that)... There surely must be something similar in Apache or Unix so you can see what executables were run when, and their resource usage etc.

It wouldn't be monitoring every second, simply creating a log entry for every process that starts/stops etc...


PS: The reason I want to do this is that Powweb have accused my scripts of using excessive CPU/resource. I don't know if I agree with that, if only because they've been running for years here so far. Hence me wanting to try and monitor and address any issues I do see. Anyway, it would be great if I started running a ps command out to a batch file every second and that in itself made me a wanted man at Powweb for too much usage :)

WhyTanFox
3-4-04, 07:34 PM
A completely untested thought:

How long do you think your script is running? 3 seconds? 5 seconds?

Maybe do something like:

`( ps -aux >> psLog; sleep 1; ps -aux >> psLog; sleep 1; {repeat as needed} ) &`


What I think that will do is run the ps/sleep/ps/sleep/etc line in a sub-shell, and the ampersand will send that process to the background so that the rest of your PHP script can continue along. So your PHP script is running while ps runs every second in the background and dumps to a log file.

Granted, this is just an idea... if the code above goes wild and eats up 50% of the CPU for 5 minutes, be ready to kill it, e.g. with `kill $!` from the same shell right after starting it (`$!` holds the PID of the last background job), or something like that :-)
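
A bounded version of the same idea, as a sketch only: instead of repeating the ps/sleep pair by hand, a small loop runs a command a fixed number of times and then stops on its own. The helper name `sample` is made up for illustration.

```shell
# Run a command COUNT times, INTERVAL seconds apart, appending its
# output to LOGFILE. (sample is a hypothetical helper name.)
sample() {
    count=$1
    interval=$2
    logfile=$3
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" >> "$logfile"
        sleep "$interval"
        i=$((i + 1))
    done
}

# Example use (not run here): five snapshots, one second apart,
# in the background so the calling script carries on.
#   sample 5 1 psLog ps -aux &
```

Because the loop has a fixed count, it cannot keep running after the last snapshot has been written.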

gillcouto
3-4-04, 09:24 PM
On some Unix versions the uptime command can list helpful info on accumulated resource use.

On PowWeb's FreeBSD it gives the average CPU load over the last 1, 5, and 15 minutes. One of the admins might know if that's per user or system-wide.
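
If those three figures are worth keeping, something like the following could pull them out of the uptime line for logging. This is a sketch: `load_averages` is a made-up helper name, and it assumes the line contains "load averages:" (FreeBSD spelling) or "load average:" (Linux spelling) followed by the numbers.

```shell
# Pull the three load-average figures out of one line of uptime
# output read on stdin. Handles both the "load averages:" and
# "load average:" spellings.
# (load_averages is a hypothetical helper name.)
load_averages() {
    sed -n 's/.*load averages*: *//p' | tr -d ','
}

# Example use (not run here): log the figures with a timestamp.
#   echo "$(date '+%H:%M') $(uptime | load_averages)" >> "$HOME/load.log"
```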

There may also be some CPU cycle counters for each user on Apache, but I really don't know. It wouldn't be hard to keep track of that in C/C++ programs.

NeilFawcett
3-5-04, 05:24 AM
Originally posted by WhyTanFox
A completely untested thought:

How long do you think your script is running? 3 seconds? 5 seconds?

Maybe do something like:

`( ps -aux >> psLog; sleep 1; ps -aux >> psLog; sleep 1; {repeat as needed} ) &`


I think this sounds like the wrong way to go... Unless there is a proper automated logging system in Apache or Unix, anything such as the above would be too inaccurate, I reckon.

Powweb sent me a list of my processes that had run in a period longer than an hour (very much like the ps -aux command), so I'm guessing such logs exist. The question is can we access them...

PS: My scripts are Perl :)

NeilFawcett
3-5-04, 05:32 AM
Originally posted by gillcouto
On PowWeb's FreeBSD it gives the average CPU load over the last 1, 5, and 15 minutes. One of the admins might know if that's per user or system-wide.

There may also be some CPU cycle counters for each user on Apache, but I really don't know. It wouldn't be hard to keep track of that in C/C++ programs.

What do you mean, FreeBSD? Wazat?

To give you an idea of what I'm after, Powweb sent me the following log. In total (this is just a sample) it appeared to show information on all my scripts run over an hour or more.

myaccount 88837 0.0 0.0 10792 236 ?? S 2:11PM 0:00.45 /usr/bin/perl forum.pl
myaccount 88900 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 88960 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 88978 0.0 0.0 10820 236 ?? D 2:12PM 0:00.47 /usr/bin/perl forum.pl
myaccount 89388 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 89405 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 89585 0.0 0.0 10824 236 ?? S 2:13PM 0:00.44 /usr/bin/perl forum.pl
myaccount 89877 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 90043 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 90096 0.0 0.0 10808 236 ?? S 2:15PM 0:00.50 /usr/bin/perl forum.pl
myaccount 90231 0.0 0.0 3372 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 90332 0.0 0.0 3372 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 90915 0.0 0.0 2644 236 ?? S 2:17PM 0:02.31 /usr/bin/perl view.pl
myaccount 90931 0.0 0.0 2644 236 ?? S 2:17PM 0:02.21 /usr/bin/perl view.pl
myaccount 91022 0.0 0.0 2644 236 ?? S 2:18PM 0:02.16 /usr/bin/perl view.pl
myaccount 91028 0.0 0.0 2644 236 ?? S 2:18PM 0:02.14 /usr/bin/perl view.pl
myaccount 91171 0.0 0.0 10808 236 ?? S 2:18PM 0:00.68 /usr/bin/perl forum.pl
myaccount 91603 0.0 0.0 3364 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 91651 0.0 0.0 10788 236 ?? S 2:20PM 0:00.44 /usr/bin/perl forum.pl
myaccount 91853 0.0 0.0 14664 236 ?? S 2:21PM 0:00.57 /usr/bin/perl forum.pl
myaccount 91924 0.0 0.0 3372 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 92469 0.0 0.0 10796 240 ?? D 2:22PM 0:00.41 /usr/bin/perl forum.pl
myaccount 93551 0.0 0.0 3364 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 94078 0.0 0.0 3364 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 94226 0.0 0.0 3364 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 94265 0.0 0.0 3364 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 95110 0.0 0.0 3364 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 95792 0.0 0.0 3372 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 95913 0.0 0.0 3372 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 96050 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 96416 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 96438 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl


Ideally this is the kind of log/list I'm after...
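
A listing like that can be boiled down per script with a little awk. This is a sketch that assumes exactly the layout of the sample above (CPU time in column 10 as minutes:seconds, script name in the last column); `summarise_ps` is a made-up helper name.

```shell
# Summarise ps-style lines read on stdin: for each script, print
# its name, how many rows it appears in, and its total CPU time in
# seconds. Assumes CPU time is in column 10 as minutes:seconds and
# the script name is the last column.
# (summarise_ps is a hypothetical helper name.)
summarise_ps() {
    awk '{
        split($10, t, ":")
        runs[$NF]++
        cpu[$NF] += t[1] * 60 + t[2]
    }
    END {
        for (s in runs) printf "%s %d %.2f\n", s, runs[s], cpu[s]
    }' | sort
}

# Example use (not run here), with the listing saved to a
# hypothetical file:
#   summarise_ps < powweb_log.txt
```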

PeterPan
3-5-04, 05:33 AM
Originally posted by NeilFawcett
I think this sounds like the wrong way to go... Unless there is a proper automated logging system in Apache or Unix, anything such as the above would be too inaccurate, I reckon.

Powweb sent me a list of my processes that had run in a period longer than an hour (very much like the ps -aux command), so I'm guessing such logs exist. The question is can we access them...

PS: My scripts are Perl :)

Have you asked ?

Maybe they do keep the logs of EVERYONE - in 1 log - So they may not want you to view / list it...

- But - They may create a script to 'extract' only your logs out ?

P

NeilFawcett
3-5-04, 05:49 AM
Originally posted by PeterPan
Have you asked ?

Maybe they do keep the logs of EVERYONE - in 1 log - So they may not want you to view / list it...

- But - They may create a script to 'extract' only your logs out ?

P

* Performs calming chant before answering *

Yes, I've emailed numerous times. With that apparently not working I also phoned (spending over an hour in the queue), which from the UK probably cost more than a month's hosting! I was also promised an email from "a programmer" to explain the matter.

The worst thing about this is that it all comes from Powweb shutting my account down for a number of days due to "excessive CPU usage". My scripts have been running on the server for years, so without some sort of explanation as to which script(s) Powweb believe were at fault, and why, how can I possibly address the problem? I'm flying completely blind :(

So... Yes, I've asked... but unfortunately have not had any information back. I'm therefore hoping one of you kind folk can instead help and give me the necessary means to see/spot any issues myself :) I'm more than happy to keep an eye on what my scripts are doing etc...

PeterPan
3-5-04, 06:12 AM
Originally posted by NeilFawcett
So... Yes, I've asked... but unfortunately have not had any information back. I'm therefore hoping one of you kind folk can instead help and give me the necessary means to see/spot any issues myself :) I'm more than happy to keep an eye on what my scripts are doing etc...

Yep - I managed to fix my problem, but my HTTP (website) was removed for 3 days while I tried to figure out which of my scripts was overloading the server..

As with you, they didn't tell me WHICH script was the cause.

P

PS - this was 4-5 months ago

NeilFawcett
3-5-04, 06:15 AM
Originally posted by PeterPan
Yep - I managed to fix my problem, but my HTTP (website) was removed for 3 days while I tried to figure out which of my scripts was overloading the server..

As with you, they didn't tell me WHICH script was the cause.

P

PS - this was 4-5 months ago

How did you fix it then? i.e. how did you know which to attack and what did you do to reduce its resource usage?

Again, my issue is my 4-5 scripts have all been running for years!

PeterPan
3-5-04, 06:27 AM
Originally posted by NeilFawcett
How did you fix it then? i.e. how did you know which to attack and what did you do to reduce its resource usage?

Again, my issue is my 4-5 scripts have all been running for years!

If my memory serves me correctly, I think I had a Perl script with loops inside of loops etc...

Took a day or so to even locate the correct script (I've got 1 Perl script & 15-20 modules)...

Once I figured out the script / module that was looping, I had to go through it line by line: re-editing, testing, uploading, re-checking, over and over until the problem was fixed..

(by putting 'exit' commands at different stages, & removing the exit commands when I knew a particular section of script was OK)

COMPLEX, & FRUSTRATING :) (especially as the info Powweb gave me didn't tell me what was wrong.)

- I only knew the script was causing the CPU to overload because they told me that in an email (not in the report...)
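
The exit-at-different-stages approach described above can be made a bit less manual. The sketch below is in shell purely for illustration (the scripts in question are Perl), and both `checkpoint` and `STOP_AT` are made-up names.

```shell
# Stop the script at a named checkpoint when the STOP_AT variable
# matches it; otherwise carry on. Moving STOP_AT later and later
# narrows down which section misbehaves.
# (checkpoint and STOP_AT are hypothetical names.)
checkpoint() {
    if [ "${STOP_AT:-}" = "$1" ]; then
        echo "stopping at checkpoint: $1" >&2
        exit 0
    fi
}

# Example use inside a script (not run here):
#   checkpoint after_parse     # everything before this is known to be OK
#   ...next section...
#   checkpoint after_lookup
```

That way the exit points can stay in the script and be switched on from outside, instead of being edited in and out on every upload.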

P

NeilFawcett
3-5-04, 06:52 AM
Originally posted by PeterPan
If my memory serves me correctly, I think I had a Perl script with loops inside of loops etc...

Took a day or so to even locate the correct script (I've got 1 Perl script & 15-20 modules)...

Once I figured out the script / module that was looping, I had to go through it line by line: re-editing, testing, uploading, re-checking, over and over until the problem was fixed..

(by putting 'exit' commands at different stages, & removing the exit commands when I knew a particular section of script was OK)

COMPLEX, & FRUSTRATING :) (especially as the info Powweb gave me didn't tell me what was wrong.)

- I only knew the script was causing the CPU to overload because they told me that in an email (not in the report...)
P

So this was a new(ish) script you had introduced? My issue is all my scripts have been running for years without problem... I don't believe any of them have bugs in them where they are eating up loads of CPU :(

PeterPan
3-5-04, 07:06 AM
Originally posted by NeilFawcett
So this was a new(ish) script you had introduced? My issue is all my scripts have been running for years without problem... I don't believe any of them have bugs in them where they are eating up loads of CPU :(

What do your scripts do ?

I'd suggest installing a server on your own PC & running the scripts with as much forced activity as you can, to see how much CPU they use.. (i.e. you use them often, & frequently..)
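
One rough way to force that kind of activity from a shell prompt is to start several copies of a script at once. This is a sketch only; `run_concurrently` is a made-up helper name, and a CGI script will probably also want variables such as REQUEST_METHOD and QUERY_STRING set, which this leaves out.

```shell
# Start N copies of a command at the same time and wait for all of
# them to finish. (run_concurrently is a hypothetical helper name.)
run_concurrently() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" &
        i=$((i + 1))
    done
    wait
}

# Example use (not run here): 20 simultaneous runs of one script,
# while watching ps or top in another window.
#   run_concurrently 20 perl forum.pl > /dev/null
```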

HOWEVER - it depends on the script. My scripts rely on a subdomain, & I couldn't / can't do that on my own PC...

So that wasn't a choice for me

P

NeilFawcett
3-5-04, 07:14 AM
Originally posted by PeterPan
What do your scripts do ?

I'd suggest installing a server on your own PC & running the scripts with as much forced activity as you can, to see how much CPU they use.. (i.e. you use them often, & frequently..)

HOWEVER - it depends on the script. My scripts rely on a subdomain, & I couldn't / can't do that on my own PC...

So that wasn't a choice for me

P

Various things... Forum, art browsing, fiction browsing, competition entering...

I have them all on my machine at home as that's where I wrote/tested them all...

It's hard to emulate heavy usage at home, e.g. 20 people using the forum. That's why I'd like to have commands/utils on Powweb to do it.

Again though, my scripts have been running trouble free for years at Powweb... Last Friday there was suddenly a problem with one of them for some reason?

WhyTanFox
3-5-04, 01:45 PM
Originally posted by NeilFawcett
What do you mean, FreeBSD? Wazat?

FreeBSD is the UNIX-based operating system* PowWeb uses. See www.freeBSD.org.

To give you an idea of what I'm after, Powweb sent me the following log. In total (this is just a sample) it appeared to show information on all my scripts run over an hour or more.

myaccount 88837 0.0 0.0 10792 236 ?? S 2:11PM 0:00.45 /usr/bin/perl forum.pl
myaccount 88900 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 88960 0.0 0.0 3368 0 ?? IW - 0:00.00 /usr/bin/perl read.pl
myaccount 88978 0.0 0.0 10820 236 ?? D 2:12PM 0:00.47 /usr/bin/perl forum.p

<snip>


Ideally this is the kind of log/list I'm after...

Looks like ps output to me.

FWIW, on my end when a UNIX box seems to be running slow, I run ps to see what may be happening, and then go and kill processes from there. My point is that I don't think PowWeb keeps a log, they're probably just looking at the data when a problem arises.

As for "my scripts normally ran fine for years": do they take any kind of user input? Somebody may have fed garbage (through HTTP POST & GET) to your script that it just couldn't handle very well, and it was put into an endless loop. Are these your scripts, or part of a package you've installed? If the scripts were written by someone else, check the project's website for security or bug updates. If it's something you've programmed, start looking for edge conditions that might cause infinite loops, or an unexpected pause in the processing.
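
One hedge against a runaway loop like that, sketched under the assumption that the host lets you lower your own limits: run the script in a subshell with a CPU-time cap set by the shell's `ulimit -t`. The name `run_with_cpu_limit` is made up for illustration.

```shell
# Run a command with a cap on the CPU seconds it may use. The limit
# is set inside a subshell, so the calling shell is not affected.
# (run_with_cpu_limit is a hypothetical helper name.)
run_with_cpu_limit() {
    limit=$1
    shift
    (
        ulimit -t "$limit" && "$@"
    )
}

# Example use (not run here): give one script at most 5 CPU seconds.
#   run_with_cpu_limit 5 perl forum.pl
```

Once the wrapped command has used that many seconds of CPU time, the system should terminate it rather than leave it spinning.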

Just my 2¢; hope it helps a little.

* I'm trying to avoid pedantic arguments over what UNIX/BSD/Linux means. Linux looks like UNIX and quacks like UNIX, but was grown on its own; the BSDs have a traceable-to-UNIX heritage.