[Orca-users] Old data / parsed data

David Michaels dragon at raytheon.com
Thu Sep 6 16:41:05 PDT 2007


Francisco Mauro Puente wrote:
> Thanks Michael, Dragon,
>
> Here is the situation:
>
> I used to run Orca on a Linux box (Pentium 4 2.8 GHz / 512 MB RAM),
> and whenever Orca started to run, the disk activity brought the
> system to its knees...
>
> Since I had all the servers scp'ing their files over to the Linux
> machine, I couldn't change that just like that, so I decided to leave
> the files where they are, but share the rrd and html directories with
> a Sun v490 so that Orca can run and process them remotely, and store
> the output on the NFS-mounted directory.
>
> While Orca ran on the Linux box, the disk and CPU activity caused
> VERY high I/O. (There are some other things running on that box.)
> I'm processing data for 30 servers, and Orca dies after some time
> here.
>
> Now that the files are being processed on the v490, I've managed to
> move the CPU load to the Sun box, but the disks are being accessed
> the same way, and the Linux box becomes useless... nothing else can
> be done on it once Orca starts to run.
>   

Ah, I see.  Ideally you should change things so that the raw files are 
written (scp'ed) to the Sun instead.  If that's not an option, maybe you 
can change things so that the RRD files are written to one of the Sun's 
local drives instead of the Linux box.  That would help a bit.  Moving 
the HTML files to physically reside local to the Sun would also help.

I would also look at the Linux box's paging activity -- you might 
simply not have enough RAM, causing the system to swap heavily (thus 
increasing I/O load tremendously).  Adding more RAM would help there.

Consider also looking at how you scp the files -- if you are scp'ing 
all files every time, that will kill your I/O over time.  Try adjusting 
it to scp only recent files (find . -mtime -2 -exec scp {} 
remotehost:/remote/path \; ).  Or use "rsync" or an equivalent tool to 
transfer only new data.
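As a sketch of the "recent files only" idea (the /tmp paths below are
stand-ins for the real raw-data directory and for the
remotehost:/remote/path target; the local cp stands in for the scp):

```shell
#!/bin/sh
# Copy only files modified in the last 2 days, mimicking the
# find/scp approach above.  /tmp paths are demo stand-ins.
SRC=/tmp/orca-raw
DEST=/tmp/orca-copy
mkdir -p "$SRC" "$DEST"
touch "$SRC/recent.data"                  # modified now -> copied
touch -d '10 days ago' "$SRC/old.data"    # too old -> skipped
# Real transfer would be:
#   find "$SRC" -type f -mtime -2 -exec scp {} remotehost:/remote/path \;
find "$SRC" -type f -mtime -2 -exec cp {} "$DEST" \;
ls "$DEST"    # -> recent.data
```

With -mtime -2 only files modified within the last two days match, so
the nightly transfer stays small no matter how much history piles up.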

> I'm in the process of getting a new server, with 2 or 4 CPUs, in
> order to run orca.
>   

If you haven't made a decision yet, you might want to avoid a Niagara 
box -- they're fantastic transaction machines, but not very good at 
floating point, which Orca does a lot of.  Niagara 2s are much better, 
as are the conventional UltraSPARCs and x86 machines.

> I'm using RICHPse-3.4.1, and will update orca to r529 or later as you
> suggested.
>
> I've attached one of my server's raw data directory, so you can see the
> size of the files
>   

Didn't see it -- it was probably too big for the mailing list.  Maybe 
you can just cut & paste a "du -sk *" for me?

> I know a simple 'find' will remove them, but once the data is already
> generated, couldn't I just remove them all? Same thing on the client
> side... right? I should keep only the files generated in the html
> directory, right?
>   


I believe the common practice is to hold on to the raw data 
(compressed), since everything else can be derived from it.  If you 
lose your RRD files or your HTML files, for example, you can recreate 
them from the raw data files.  If you lose your raw data files, though, 
you can no longer recover from a loss of the RRD or HTML files.  You 
may also encounter situations where you need to regenerate the RRD or 
HTML files from scratch.  Without the raw data files, you lose that 
option, and that could be problematic down the road.

If you want to reduce space, perhaps archiving your raw data would be 
the way to go.  I wouldn't remove old raw data except as a last resort.
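A sketch of the archiving approach -- gzip raw files past a cutoff age
rather than deleting them (the /tmp directory, the 30-day cutoff, and
the percol-* filename pattern are assumptions; adjust to your layout):

```shell
#!/bin/sh
# Compress raw data files older than 30 days instead of removing
# them; they can still be decompressed and re-fed to orca later if
# the RRD/HTML files ever need regenerating.  Paths are demo
# stand-ins for the real raw-data tree.
RAW=/tmp/orca-rawdata
mkdir -p "$RAW"
touch -d '60 days ago' "$RAW/percol-2007-07-01"   # old -> gzipped
touch "$RAW/percol-2007-09-06"                    # current -> left alone
find "$RAW" -type f -name 'percol-*' ! -name '*.gz' \
     -mtime +30 -exec gzip {} \;
ls "$RAW"
```

The ! -name '*.gz' guard keeps the find from re-matching files it has
already compressed on later runs.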

If you have a lot of change in your environment, comb through the raw, 
RRD, and HTML directories, and see if you can find directories for 
servers that no longer exist or that have changed in substantial ways -- 
remove those areas first, to help mitigate your space crunch.
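To spot the biggest consumers quickly when combing through those
directories, a du/sort one-liner works well (the demo directory and
server names below are stand-ins for your raw/RRD/HTML trees):

```shell
#!/bin/sh
# Rank per-server directories by disk usage, biggest first, so
# stale or oversized server trees stand out.  /tmp path is a demo.
BASE=/tmp/orca-servers
mkdir -p "$BASE/serverA" "$BASE/serverB"
dd if=/dev/zero of="$BASE/serverA/data" bs=1024 count=100 2>/dev/null
du -sk "$BASE"/* | sort -rn
```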

And of course, consider altering the "orca" script and/or orcallator.se 
to record less data.

> I hope this information helps a bit more.
>   

Yes, very informative, thanks. :)

--Dragon


