From blair at orcaware.com  Wed Oct 24 23:40:13 2001
From: blair at orcaware.com (Blair Zajac)
Date: Wed, 24 Oct 2001 23:40:13 -0700
Subject: [Orca-announce] Orcallator.se 1.32 released
Message-ID: <3BD7B3CD.E0C925D2@orcaware.com>

Orcallator.se 1.32 is released.  It is available at

  http://www.orcaware.com/orca/pub/orcallator.se-1.32.txt

This release gathers new data that allows for new plots giving a good
indication of the health of all components of a system.  A sample plot
is online at

  http://www.orcaware.com/orca/docs/orcallator.html#system_overview

This release also fixes some bugs in previous releases.

This is the orcallator.se that will ship with the upcoming Orca 0.27.
Let me know if there are any problems with it.

Below is the list of changes starting from orcallator.se 1.23, which
was included in the Orca 0.26 release.

Blair

--
Blair Zajac - Perl & sysadmin services for hire
OS & web analysis - http://www.orcaware.com/orca/

// Version 1.32:   Oct 24, 2001 Fix a problem where the web access log file
//                              pointer instead of the file descriptor was
//                              being passed to fstat(). Fix a problem where
//                              the cached web access log stat() information
//                              wasn't being erased if the log file was
//                              successfully stat()ed but then fopen() failed.
//                              Rename variables used to keep track of open
//                              file pointers and file stat() information to
//                              be clearer: ofile to out_log_fp, www_fd to
//                              www_log_fp, www_stat to www_log_stat, www_ino
//                              to www_log_ino and www_size to www_log_size.
//                              Problem noted by Jeremy McCarty.
// Version 1.31:   Oct 21, 2001 Instead of naming the output files percol-*,
//                              name them orcallator-*. Always define
//                              USE_RAWDISK to use the new raw disk code.
//                              Previously, USE_RAWDISK was defined only if
//                              WATCH_OS was defined, but if WATCH_DISK was
//                              defined and WATCH_OS was not, then the new raw
//                              disk code was not being used. This change
//                              makes the behavior consistent.
// Version 1.30:   Oct 19, 2001 Rename the new State_* columns to state_*.
// Version 1.30b2: Oct 12, 2001 Output eleven new columns named State_* where
//                              each column represents numerically the state
//                              of one of the system's subsystems as they
//                              appear in the DNnsrkcmdit output. The
//                              character * is replaced with the same
//                              character that appears in the DNnsrkcmdit
//                              string to represent the particular subsystem.
//                              This can be used to create a single plot that
//                              shows how all of the subsystems are
//                              performing. The mapping between successive
//                              states is exponential, so that as a subsystem
//                              gets into a worse condition, the plot shows
//                              a higher value. Patch contributed by Rusty
//                              Carruth. Make all of the live_rules.se live
//                              and temporary variable names consistent.
// Version 1.30b1: Oct 8, 2001  Changed the method used by raw_disk_map to
//                              detect the end of GLOBAL_disk_info to looking
//                              for the first short disk name. This works for
//                              SCSI disks. It also looks for fd or st
//                              devices, which should work for EIDE devices.
//                              Patch contributed by Alan LeGrand.
// Version 1.29:   Oct 5, 2001  In SE 3.2.1 stat.se, mknod is a C-preprocessor
//                              define to _xmknod on x86 systems while on SPARC
//                              systems stat.se declares mknod as a normal
//                              function. When stat.se is included before
//                              kstat.se on x86 systems the mknod define
//                              causes a compile error on kstat's mknod
//                              variables which are part of the ks_rfs_proc_v3
//                              and ks_rfs_req_v3 structures. The workaround
//                              is to include kstat.se before stat.se.
// Version 1.28:   Oct 2, 2001  No changes, bump version number to 1.28.
// Version 1.28b7: Sep 29, 2001 Change the output log filename format from
//                              percol-%Y-%m-%d to percol-%Y-%m-%d-XXX, where
//                              XXX is a number starting at 0 that is
//                              incremented any time the number of output
//                              columns changes or the type of data stored in
//                              a column changes. This is in addition to the
//                              creation of a new log filename when a new day
//                              starts.
//                              Whenever the program needs to create
//                              a new log file for any reason, it will search
//                              for the smallest XXX so that there are no log
//                              files named percol-%Y-%m-%d-XXX{,.Z,.gz,.bz2}.
//                              If the COMPRESSOR environment variable is set
//                              and any uncompressed files are found while
//                              looking for the smallest XXX, they are
//                              compressed with the COMPRESSOR command.
// Version 1.28b6: Sep 28, 2001 Instead of outputting the number of CPUs only
//                              when WATCH_MUTEX is defined, output it when
//                              either WATCH_CPU or WATCH_MUTEX is defined.
//                              Only declare and update tmp_mutex if
//                              WATCH_MUTEX is defined.
// Version 1.28b5: Sep 28, 2001 Add three parameters that vmstat outputs:
//                              #runque, vmstat's `r' column, which is the
//                              number of processes in the run queue waiting
//                              to run on a CPU; #waiting, vmstat's `b' column,
//                              which is the number of processes blocked for
//                              resources (I/O, paging); and #swpque, vmstat's
//                              `w' column, the number of processes runnable
//                              but swapped out. Increase MAX_COLUMNS from 512
//                              to 2048. Check [wr]lentime to see if an EMC
//                              disk is using a fake disk for control. EMC
//                              disks have a fake disk through which commands
//                              are run to configure the disk array or to get
//                              stats from it; they are not real data
//                              transfers. They can cause 1000 MB/sec writes
//                              to appear in the stats. I still get them but
//                              not as often with this bit of code in. If the
//                              I/O which occurred in the last five minutes is
//                              not greater than 1/100sec then it is not a
//                              valid measurement anyway. What happens is that
//                              we can have a small I/O, say 1024 bytes, in
//                              1/100sec, which is reported as 1024*100
//                              bytes/sec. I am thinking of making it
//                              wlentime+rlentime > 2 since I am still getting
//                              fake write spikes. Make sure to define
//                              HAVE_EMC_DISK_CONTROL to enable this check.
//                              Patch contributed by Damon Atkins.
// Version 1.28b4: Mar 27, 2001 Recoded measure_disk() to access the RAWDISK
//                              interface to sys_kstat device information to
//                              allow the activity on Sun's A1000 and Clariion
//                              RAID controller drives to be seen. Apparently
//                              the pseudo drivers do not update the kstat
//                              interface. It also inverts the fix provided
//                              by version 1.23 to avoid over-counting md
//                              devices, by suppressing stats from slices
//                              and metadevices and instead reporting on full
//                              devices such as c0t0d0 or sd0. Note: This may
//                              have introduced an interaction with the
//                              live_rules.se class monitoring of drive
//                              performance. Exclude floppy disks and tape
//                              drives from RAWDISK. Added wio% to measure
//                              wait time since the idle calculation is wrong
//                              without this. Prevent filesystems mounted
//                              under /snapshots from being seen. Patch
//                              contributed by Alan LeGrand.
// Version 1.27:   Mar 27, 2001 Print the portion of time running in idle mode
//                              with some process waiting for block I/O as
//                              wio% and otherwise completely idle time as
//                              idle%.
// Version 1.26:   Feb 5, 2001  Make sure to check the return from stat() on
//                              the web server access log in case the file is
//                              missing. Use fstat() instead of stat() when a
//                              file descriptor is available.
// Version 1.25:   Mar 30, 2000 Fix a typo where nil was misspelled as nik.
// Version 1.24:   Mar 25, 2000 When orcallator.se runs on a system with an
//                              older version of SE, the p_vmstat.scan
//                              variable is an integer and the sprintf to
//                              %8.3f fails, resulting in a perceived scan rate
//                              of 0 pages per second. Now always add 0.0 to
//                              p_vmstat.scan to get a double.
// Version 1.23:   Feb 25, 2000 When orcallator.se was running on a system
//                              with DiskSuite, the same physical disk was
//                              listed multiple times when it appeared in
//                              several metadevices. The solution to the
//                              problem is not to build the c0t0d0 name but
//                              to use the long disk name provided by the
//                              long_name string. Patch contributed by Paul
//                              Haldane.
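
For readers who want to see the log file naming rule from the 1.28b7 entry spelled out, here is a minimal sketch in Python rather than SE. It reads the entry as "pick the smallest XXX for which no file exists in any of the four forms". The function name next_log_name, the three-digit zero padding of XXX, and the example prefix are assumptions made for illustration and are not taken from orcallator.se.

```python
# Suffixes listed in the 1.28b7 entry: uncompressed, .Z, .gz and .bz2.
SUFFIXES = ("", ".Z", ".gz", ".bz2")


def next_log_name(day_prefix, existing_names):
    """Return day_prefix-XXX for the smallest unused XXX.

    day_prefix is something like "orcallator-2001-10-24" (hypothetical);
    existing_names is any iterable of file names already in the log
    directory.  XXX is zero padded to three digits here, which is an
    assumption; the changelog only says it is a number starting at 0.
    """
    names = set(existing_names)
    index = 0
    # An index is taken if the log exists uncompressed or compressed.
    while any(f"{day_prefix}-{index:03d}{s}" in names for s in SUFFIXES):
        index += 1
    return f"{day_prefix}-{index:03d}"
```

A caller would typically pass os.listdir() of the output directory as existing_names. Compressing leftover uncompressed files with COMPRESSOR is left out of the sketch.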
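
The exponential state mapping described in the 1.30b2 entry can be illustrated with a small Python sketch. The changelog gives neither the state names nor the numeric values, so the severity order below (white, blue, green, amber, red, black) and the powers of two are assumptions chosen only to show the idea. Only the DNnsrkcmdit string and the state_* column naming come from the entries above.

```python
# One character per subsystem, as in the DNnsrkcmdit output.
SUBSYSTEMS = "DNnsrkcmdit"

# Assumed severity order, mildest first; not taken from orcallator.se.
STATE_ORDER = ("white", "blue", "green", "amber", "red", "black")


def state_value(state):
    """Map a state name to a number that doubles with each worse state."""
    level = STATE_ORDER.index(state)
    return 0 if level == 0 else 2 ** (level - 1)


def state_columns(states):
    """Build the state_* columns from a dict of subsystem char -> state."""
    return {f"state_{c}": state_value(states[c]) for c in SUBSYSTEMS}
```

With this assumed mapping, green plots as 2 and red as 8, so a single subsystem in trouble stands out clearly when all eleven columns share one plot.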
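
The web access log fixes in the 1.26 and 1.32 entries follow a common pattern: check the result of stat(), pass the descriptor (not the file pointer) to fstat(), and erase cached stat() information when the open fails. The Python sketch below shows that pattern under stated assumptions. The class WebLogTracker and its methods are hypothetical, and only the www_log_* attribute names echo the renamed variables from the 1.32 entry.

```python
import os


class WebLogTracker:
    """Keeps an open access log together with cached stat() fields."""

    def __init__(self, path):
        self.path = path
        self._clear()

    def _clear(self):
        # Forget the handle and the cached stat() fields together.
        self.www_log_fp = None
        self.www_log_ino = None
        self.www_log_size = None

    def close(self):
        if self.www_log_fp is not None:
            self.www_log_fp.close()
        self._clear()

    def reopen(self):
        """Return True if the log is open and the cache describes it."""
        self.close()
        try:
            st = os.stat(self.path)
        except OSError:
            return False  # the file may be missing (1.26 entry)
        self.www_log_ino, self.www_log_size = st.st_ino, st.st_size
        try:
            self.www_log_fp = open(self.path, "rb")
        except OSError:
            self._clear()  # erase the cached stat() data (1.32 entry)
            return False
        # fstat() wants the descriptor, so go through fileno().
        st = os.fstat(self.www_log_fp.fileno())
        self.www_log_ino, self.www_log_size = st.st_ino, st.st_size
        return True
```

Re-reading the fields through fstat() after the open means the cache describes the file that was actually opened, even if the path was replaced between the stat() and the open.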