[Orca-users] Orca using over 40% CPU on Server

B.Schopp at gmx.de B.Schopp at gmx.de
Fri Jun 20 09:01:34 PDT 2003


Hi Matt,

> However, running on a Sun Solaris 8 system (Ultra 30, UltraSPARC-II
> 296MHz CPU, 512MB memory), the Orca process seems to always use over
> 40% of the CPU. The only reason it does not use 100% is that other
> monitoring processes (Nagios) need to run as well.
>
> At the moment I am only monitoring 53 clients.
> This is about to be stepped up to 300 clients.

The main problem we had with similar hardware was the disk subsystem.
In the beginning we monitored about 30 hosts with only 2 disks in the
orca-server. This configuration resulted in an average of 35% wait-I/O
on the CPU. After I increased the number of disks connected to the
orca-server to 6, the wait-I/O decreased to an average of about 3%!
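For reference, wait-I/O on Solaris can be read from the %wio column of
`sar -u`. Here is a small sketch that averages that column; the sample
output below is made up for illustration, not taken from my machine:

```shell
# On a live Solaris box you would capture utilisation with, e.g.:
#   sar -u 5 12        # 12 samples, 5 seconds apart
# Here we parse a canned sample so the awk filter itself can be shown.
sample='12:00:05  %usr %sys %wio %idle
12:00:10    20    10    35    35
12:00:15    18    12    38    32
12:00:20    22     9    32    37'

# Average the 4th column (%wio), skipping the header line.
avg_wio=$(printf '%s\n' "$sample" | awk 'NR>1 {sum+=$4; n++} END {printf "%.1f", sum/n}')
echo "average %wio: $avg_wio"
```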

> Is the CPU going to go through the roof and grind the poor monitoring
> server to a halt !

I think your main problem with around 50 monitored hosts on a U30
will be the I/O bottleneck, so increasing performance there will have
the best impact. A U30 won't be enough for 300 hosts; I guess putting
300 clients on the U30 will result in a delay of at least 20-30 minutes.

> With all clients writing to the above via NFS , Note this is a local
> disk to the server , NFS mounted to the clients

Currently, I get all data to the orca-server using rsync over SSH. When
monitoring 300 hosts you should use rsync, because with NFS your
monitored hosts will slow down if your orca-server crashes and the
NFS server is lost.

At the beginning of this year, I was given the order to move the
orca-server off Sun hardware and migrate it to Linux. I used a bunch of
old desktop boxes to set up a little calculation farm for orca:
1 PIII/800 as file-/web-/rsync-server with a 4-disk software RAID10
1 PIII/500 as backup for the file-/web-/rsync-server
5 PIII/550 as calculation engines for orca

The fileserver has 3 NICs: one connected to the intranet, one connected
to the internal switch, and one via crossover to the backup server. The
backup server is configured similarly, except for the disks: it has only
a 2-disk RAID1.
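The internal NFS sharing amounts to one export on the fileserver and one
mount on each calc-engine; the path and subnet below are illustrative,
not from the original setup:

```shell
# Sketch of the NFS configuration lines (path and subnet are examples).
# On the fileserver, /etc/exports would contain something like:
exports_line='/export/orca  192.168.1.0/24(rw,sync,no_subtree_check)'
# and each calc-engine would mount it with an /etc/fstab entry like:
fstab_line='fileserver:/export/orca  /export/orca  nfs  rw,hard,intr  0 0'
echo "$exports_line"
echo "$fstab_line"
```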
Data from the monitored hosts is collected by the fileserver via rsync
over SSH. Internal data is shared via NFS, and all calc-engines read and
write their data from/to the fileserver. Currently 3 calc-engines are in
use, each calculating the data for 25 clients with a delay (relative to
the 5-minute interval) of about 2 minutes in the worst case. The Linux
boxes also run percollator; their data is processed on the fileserver to
keep the calc-engines free for the Solaris boxes.
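The split of clients across calc-engines is just a static round-robin
assignment; a sketch with made-up host names (we track ours in a plain
list, but any partitioning that balances the count works):

```shell
# Round-robin assignment of monitored hosts to calc-engines.
# Host and engine names are hypothetical examples.
engines=3
i=0
assignments=""
for host in client01 client02 client03 client04 client05; do
  engine=$(( i % engines + 1 ))
  assignments="$assignments$host->engine$engine "
  i=$(( i + 1 ))
done
echo "$assignments"
```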
My main problem is that I don't have an external JBOD to connect to the
fileserver to improve I/O performance, so I see slight throughput
problems in the disk subsystem because of the large number of I/O
operations per second. But if you have the right hardware for a scenario
like this, all orca graphs will be generated 'just in time', and it's
quite cost-effective ;-)

The big advantage of this config is its scalability: just add more
boxes or increase the CPU power to handle the data of more hosts. With
current CPUs, one single-CPU server should be enough to process the
data of about 100-150 hosts within a 5-minute interval. These numbers
are just a guess! Does anybody have experience with orca on Linux and
current CPUs?

One of the problems with this config is that you don't have one single
entry page to orca: each calc-engine gets its own entry page with its
server list. For our needs, this is a good feature, as we are about to
offer orca monitoring to other departments within the company, and they
will get their own pages with restricted access. Maybe I'll get the
JBOD if the number of users and monitored hosts grows high enough ;-)

Best,
Burkhardt
