[Orca-dev] Orcallator.se changes and additions

Dmitry Berezin dberezin at surfside.rutgers.edu
Tue May 18 09:29:54 PDT 2004


Blair,

> That would be great.  I think you should send me patches to fix and make
> the additions you want to.

Definitely, I just want to finalize and test them before sending.

> > 1. "Fix" timestamp.
> >
> >             Timestamps that orcallator.se uses for each data entry do
> > not perfectly line up with the data gathering interval and could be off
> > by a couple of seconds.
> 
> Where do you see this happening?  On slow systems?  On large systems
> with many things to measure?

I see it on all of my systems: for example, a V1280 with 8 (or 4) CPUs and
10-15 HDs, or an E6500 with 10 CPUs, with utilization ranging from 0% to
over 50%.
It looks like the problem is in the sleep_till_and_count_new_processes
function:
sleep_till1 = now + 5;
This makes it sleep for a full 5 seconds even if that goes past the
"sleep_till" value.
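
A minimal sketch of the idea, in Python rather than SE and not taken from orcallator.se itself: clamp each short sleep so it never runs past the target time, and pick the target as the next integer multiple of the interval. The function names `next_wakeup` and `sleep_chunk` and the 5-second default are assumptions made for illustration only.

```python
def next_wakeup(now: int, interval: int) -> int:
    """Return the first timestamp after `now` that is a multiple of `interval`."""
    return (now // interval + 1) * interval


def sleep_chunk(now: int, sleep_till: int, max_chunk: int = 5) -> int:
    """Return how long to sleep in one step without overshooting `sleep_till`.

    The unclamped form (always sleeping `max_chunk` seconds) can run past
    `sleep_till`; taking the minimum avoids that.
    """
    remaining = sleep_till - now
    return max(0, min(max_chunk, remaining))
```

With a target 3 seconds away, `sleep_chunk(100, 103)` gives 3 instead of 5, so the wakeup lands on the target instead of 2 seconds after it.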

> >
> > I agree that "fixing" any data is not a very good idea, but since the
> > change in the value of each timestamp will be small, I do not think that
> > this will have any significant effect on data averaging.
> 
> Recent versions of orcallator.se try to avoid this problem by getting now
> to be exactly an integer multiple of the interval.  What version are you
> working with?

I work with version 1.37.

> 
> Also, are you using the web log processing code?  I could see this
> having an impact.

I'm seeing this problem on the systems where I do NOT use web log
processing.

> > 2. Problem with compressing data files.
> 
> Hmmm, I'd rather keep that still in the background.
> 
> How about adding a second test, where orcallator.se will check to see if
> the compressed filename exists, and if it does, it'll assume that the
> compression program is still running.

How about this: check if the new file name is the same as the previous file
name up to (but not including) the file number portion. If it is, increase
the file number by one instead of setting it to 0. This will only add one
string compare operation and will also eliminate the need to "stat" all
existing files every time we need to switch logs in the middle of the day.
This will have a minor side effect, though: existing log files will not be
compressed.
But right now orcallator.se compresses existing files for the current day
only, and they all should be compressed already anyway. Besides, during the
first run (if we start orcallator.se manually in the middle of the day) it
will still compress all existing files for that day. It's becoming too
wordy... Well, I hope the main idea is clear.
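
To make the proposal concrete, here is a rough sketch in Python (not SE, and not the actual patch). It assumes a file name of the form `<prefix>-NNN` where the prefix already contains the date; that layout, the three-digit number, and the function name `next_log_name` are assumptions for illustration.

```python
from typing import Optional


def next_log_name(prefix: str, previous: Optional[str]) -> str:
    """Pick the next log file name without stat()-ing existing files.

    `prefix` is the new name up to, but not including, the file number.
    `previous` is the full name of the last log file written, if any.
    """
    if previous is not None:
        # Split the previous name into its prefix and its file number.
        prev_prefix, _, prev_number = previous.rpartition("-")
        if prev_prefix == prefix and prev_number.isdigit():
            # Same prefix: continue numbering from the previous file.
            return f"{prefix}-{int(prev_number) + 1:03d}"
    # First file for this prefix: start numbering at 0.
    return f"{prefix}-000"
```

The only per-switch cost is one string comparison. The number restarts at 0 when the prefix changes, for example on a new day.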

> Patch welcome for this.

I will write the patch once we agree on the approach for this.

> > 3. Problem with proc_next().
> 
> Do you get the pr_fname member function on the proc_class_t?  As it is
> used here:
> 
>    for (p=first_proc(); p.pr_pid != -1; p=next_proc()) {
>      for (i=0; i<number_regexs; ++i) {
>        if (p.pr_fname =~ regexs[i]) {
>          ++count_procs_results[i];
>        }
>      }
>    }

Yes, but this is irrelevant - if I simply count all processes without
checking pr_fname, I sometimes get very different results. I even wrote a
small program that runs two loops (first_proc/next_proc and pp.index),
counts the number of processes in each case and prints the results.
Occasionally, the results are very different (by 30%-50%). This does not
seem to depend on the interval at which the loops are run - running them
every 5 seconds or every 5 minutes shows the problem either way.
It would be interesting to see whether other people would be willing to run
this program on their systems to check if the same problem occurs there.
This problem is 100% reproducible on my Solaris 8 box with SE 3.3.1.
Rich Pettit could not comment on this problem when I reported it to him,
since somebody else wrote the code for the proc class.

> > 5. Process count.
> >
> >             Orcallator.se provides the ability for counting http and
> > https processes, but in a lot of cases there might be a need to count
> > some other types of processes (oracle database connections, for
> > example). It would probably be better to have a generic function that
> > would count "any" processes based on some initialization variables.
> >
> > Blair, is this of any interest? If yes, I could work on this. However,
> > see the next topic.
> 
> Yes, a general way of doing this would be good.
> 
> > (I have also written a patch that counts Oracle database connections for
> > all Oracle databases, dynamically finding all instances; but now I am
> > not sure if this is of any value to anybody.)
> 
> There have been requests for Oracle data before, so I would say yes.

Well, this is not exactly Oracle data, since it only counts the processes.
The convenience of this patch, however, is that it finds all databases
dynamically, so it would be easy to deploy in large environments.
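
For the generic counter discussed in item 5, a minimal sketch of what "count any processes based on some initialization variables" could look like, written in Python rather than SE. The function name, the label-to-regex mapping, and the sample process names are all hypothetical; the real code would take its names from the proc class.

```python
import re
from typing import Dict, Iterable


def count_processes(names: Iterable[str], patterns: Dict[str, str]) -> Dict[str, int]:
    """Count process names matching each labelled regular expression.

    `patterns` maps a label (e.g. a column name) to a regex; a process
    is counted once for every pattern it matches.
    """
    compiled = {label: re.compile(rx) for label, rx in patterns.items()}
    counts = {label: 0 for label in patterns}
    for name in names:
        for label, rx in compiled.items():
            if rx.search(name):
                counts[label] += 1
    return counts
```

Usage with made-up process names:

```python
counts = count_processes(
    ["httpd", "httpd", "oracleORCL", "sshd"],
    {"httpd": r"^httpd$", "oracle": r"^oracle"},
)
```

Adding a new process type is then a configuration change (one more pattern) rather than a code change.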

> > 6. Workload statistics.
> >
> > Blair, I can send this for review. Also, I need some help with graphing
> > the data.
> 
> Be glad to help.

After I posted this, I received a copy of workollator.cfg from Justin
Buhler, which answered some of my questions, but I will probably have more
later :-)

> After reviewing several of the patches, let's get you Subversion access
> so you can get these changes into Orca/orcallator.se.

I will send you all of these patches as soon as I finish testing them. 

  -Dmitry.




