[Orca-users] Re: Help with scaling ORCA

Blair Zajac blair at akamai.com
Sun Feb 18 20:52:09 PST 2001


Hi John,

Impressive setup :)

The method you suggested of splitting the hostnames up by first letter
is similar to the one that I used at GeoCities and it works pretty
well.
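
For anyone who wants to try the same split: Orca takes the subgroup
name (the hostname) from the parenthesized part of the find_files
regular expression, so you can give each Orca process its own config
file whose regex matches only one bucket of hostnames.  Roughly
(paths hypothetical; start from the find_files line in the
orcallator.cfg you already have):

    group orcallator_a_g {
    find_files  /orca/data/([a-g][^/]*)/percol-\d{4}-\d{2}-\d{2}(?:\.(?:Z|gz))?
    # ... the rest of the group as in the stock orcallator.cfg
    }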

Do you have orcallator.se compress the output percol files?
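
If not, it's worth turning on.  Recent versions of orcallator.se
check the COMPRESSOR environment variable and, when it is set, use
it to compress the just-closed percol file each time a new day's log
is opened.  start_orcallator is the usual place to set it:

    # set in the environment before orcallator.se starts;
    # "compress" works here too
    COMPRESSOR="gzip -9"
    export COMPRESSOR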

Orca starts to run into problems with its input data files when you
get up to this many hosts.  There are several solutions:

1) After the percol files get loaded, move them to another location
   so that Orca cannot find them.  This should speed Orca up if you
   run it with the -o command line option, though I don't know how
   much it will help if you keep Orca running continuously.  One
   caveat: if the files you move away contain old columns of data
   that are not in the newer files, those columns may not be plotted
   at all.  There's a rough sketch of this after the list.

2) Modify orcallator.se to dynamically load RRDs.so and, instead of
   writing all of the data to a single text file, have orcallator.se
   write a single RRD file for each measurement.  Then have Orca use
   the RRD files both to decide what types of plots to create and as
   the source of the data.

   This would solve a large number of scalability problems, and I'm
   looking for someone to take this on; there's a rough sketch of
   the per-measurement RRD layout after this list.  If there's a
   volunteer, contact me directly to work with the latest Orca build.
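
To make these concrete: for (1), a rough sketch with hypothetical
paths, moving percol files that have not been written in two days
out of the tree that find_files searches:

    # archive percol files Orca has already loaded; adjust the
    # paths and the age cutoff to taste
    find /var/orca/data -name 'percol-*' -mtime +2 \
        -exec mv {} /var/orca/data-archive/ \;

For (2), the idea is one small round-robin database per measurement
instead of one wide text file per day.  Using the rrdtool command
line just to show the shape of it (the DS and RRA parameters are
only illustrative):

    # one RRD per measurement, e.g. CPU user time on one host
    rrdtool create usr.rrd --step 300 \
        DS:usr:GAUGE:600:0:100 \
        RRA:AVERAGE:0.5:1:1440 \
        RRA:AVERAGE:0.5:6:700
    # then one update per five-minute interval
    rrdtool update usr.rrd N:42.5

The real thing would call into RRDs.so directly rather than fork
rrdtool, but the on-disk layout would be the same.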

Regards,
Blair


John Mastin wrote:
> 
> We have a fairly large installation of ORCA (you would be proud,
> Blair).  Currently, we are monitoring 102 boxes and have over 3.5 GB
> of raw data.  We don't plan on stopping our growth either. :-)
> Unfortunately, all of this is behind our firewall, otherwise I'd
> point you all to it.  Our layout looks like this:
> 
> On the boxes to be monitored (ORCA client):
> We have SE 3.1 preFCS installed with all patches.  We run
> orcallator.se and spool all of our raw data (percol files) into
> /var/orca/data.  We are also running an rsync daemon to service raw
> data file transfer requests.
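>
> The daemon side is just a small /etc/rsyncd.conf on each client
> (the module name here is made up, but it is about this much):
>
>     # read-only module exporting the percol spool
>     [orca]
>         path = /var/orca/data
>         read only = yes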
> 
> ORCA server:
> This is an Ultra 10, 128MB ram and ~80GB disk.  Its sole purpose is to
> do ORCA.  It uses rsync to synchronize the raw data files from the
> clients every hour.  From there, it groks the data, puts it into rrd
> databases, builds webpages and plots.  There is a boa webserver on it
> and it serves up the ORCA pages directly.
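>
> The hourly sync is a cron job on the server that does a daemon-mode
> pull from each client, roughly (hostnames and destination paths
> simplified):
>
>     # repeated for every monitored host
>     rsync -az clienthost::orca/ /orca/data/clienthost/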
> 
> We've run into the case where it takes longer than an hour to go
> through all of the data.  Originally, I had the rsync happen every
> ten minutes and ran orca -o out of cron, once an hour.  When it
> overran that, I scaled the rsync back to once an hour and ran orca
> constantly (no -o option).  After a while, it couldn't keep up
> (again).  Last week, I maxed out the filesystem that held the raw
> data files.  Things were messed up, so I decided it was time to
> reload from scratch and see if I could come up with a new way of
> getting Orca to keep up.  I am entertaining the idea of splitting
> up my servers by hostname (regex in find_files) and running
> multiple orca processes.  Orca process #1 will handle servers that
> begin with a-g, process #2 will handle h-p, process #3 will handle
> q-t, and process #4 will handle u-z.  I figure this way, I could
> distribute the load of keeping up.  I realize that it is all still
> being run on one processor; I just wanted to see if I could exhaust
> my organizational options before I cry for additional hardware.
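>
> Concretely (the config file names are made up), cron would kick off
> something like:
>
>     orca -o /etc/orca/orcallator-a-g.cfg &
>     orca -o /etc/orca/orcallator-h-p.cfg &
>     orca -o /etc/orca/orcallator-q-t.cfg &
>     orca -o /etc/orca/orcallator-u-z.cfg &
>
> with each config's find_files regex restricted to its own slice of
> the hostnames.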
> 
> Anybody think this new plan is sane/insane?  Any other ideas on how to
> improve scaling?
> 
> Comments/questions welcomed!
> 
> Johnny
> --
> John Mastin, Jr.              email: john.mastin at bms.com
> Bristol-Myers Squibb          phone: (609) 818-3788
> PRI Infomatics                fax:   (609) 818-7693
> Princeton, NJ


