SRULECONV.PRO

posted 02-20-2001 04:47 AM
Hi all,
The error matrix function is working, so now the program takes the S-plus rules and outputs classifyd rules, a key for DN=classname, confusion matrices in pixel counts and percentages, Producer’s and User’s accuracies, and a Kappa statistic (quite easy once the darn pixel totals were made to work). Pretty slick.
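
(For anyone who wants to check the numbers by hand: the Kappa comes straight off the pixel-count confusion matrix. Here’s a reference sketch of the standard formula that the error-matrix function below implements; “cm” and “nclasses” are just stand-in names for the pixel-count matrix and the number of classes.)

code:
; Reference sketch only: Kappa from an nclasses x nclasses pixel-count
; confusion matrix cm (rows and columns are classes).
n      = float(total(cm))        ; total number of training pixels
rowsum = total(cm, 1)            ; marginal totals, one per class
colsum = total(cm, 2)            ; the other set of marginal totals
xkk    = 0.0                     ; sum of the diagonal (agreement)
xkpxpk = 0.0                     ; sum of products of the marginals
for k = 0, nclasses-1 do begin
  xkk    = xkk + cm[k,k]
  xkpxpk = xkpxpk + rowsum[k]*colsum[k]
endfor
kappa = (n*xkk - xkpxpk) / (n^2 - xkpxpk)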

I have also cleaned up the code from SPLUS2CLASSIFYD_NJM2.PRO (the ancestor of SRULECONV). The classifyd rules now round the S-plus x.5 thresholds DOWN instead of UP (e.g., a 62.5 split becomes a threshold of 62).

There is some new messiness, though, in the new error-matrix function (TRAINING_ROI_ERROR_MATRIX).

The program does not actually write the ASCII output files yet, but I presume that’s a trivial thing to add.
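
(If someone wants to wire that up before I do, a minimal sketch would be something like the following; “outname” is just a placeholder for whatever rules filename gets chosen, and RULESARRAY/nrows are arrays the program already builds.)

code:
; Untested sketch: dump RULESARRAY (the 6 x nrows byte array of classifyd
; rules) to an ASCII file. outname is a placeholder filename.
openw, lun, outname, /get_lun
for j = 0, nrows-1 do begin
  printf, lun, rulesarray[*, j], FORMAT='(6I6)'   ; one rule per line
endfor
free_lun, lun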

My next step would be to have the routine output the classifyd ctl file, but we’ll see when I get to that as I actually need to start using the programs rather than writing them.

Here’s the code. The next post will have the input and output and a link to my webpage where a sample Madagascar classification was run.

Thanks, Nick

(Note my use of the code tags)

code:
;SRULECONV.PRO
;Version 2.0
;by Nick Matzke and Kerry Halligan, Jan.- Feb. 2001
;UCSB Geography Department
;
;The primary purpose of SRULECONV.PRO is to take the output of
;the S-plus binary-decision-tree algorithm (consisting of an ASCII text
;rules file) and produce an input rules file that the C-program
;CLASSIFYD (written by Steve Clester, and updated (??) and
;currently used by Dar Roberts) uses to apply a supervised decision
;tree classification to byte images such as Landsat TM.
;
;The secondary purposes of this program are to:
;
;1) Automatically produce a training-site confusion matrix and to
;calculate accuracy statistics based on the confusion information
;available in the S-plus ASCII rules output. The user can then review
;the confusion matrix before deciding whether to proceed with the
;classification or to modify training-site selection so as to
;improve the decision tree.

;1.5) This (and other) information will be displayed on
;screen and output to a metadata file (*.md) for the purpose of
;documenting the occasion, method, and results of the classification
;run.
;
;2) Automatically produce a control file (*.ctl) and runfile (*.exe)
;suitable for running CLASSIFYD with minimal effort. The files
;will also serve as further metadata about the classification run.
;
;3) Produce a confusion matrix for the “test-site” pixels (as opposed
;to training site pixels). Test-site pixels are not used in the
;generation of the classification tree, and are generated by
;the prior interpretation of “truth” on the image, or by selecting
;areas on the image that have been ground-truthed. The nature of
;supervised classifiers ensures that they will almost always work
;better on the dataset that trained them than on the rest of the
;image or the test data. A common practice is to select a
;number of “truth sites” (either based on ground observations or
;image interpretation) and then randomly choose half of them as
;training sites and the other half as test sites. This gives
;additional information about the accuracy of the decision-tree
;classification.
;
;
;Further desirable features:
;
;1) The goal of the program is to make all output files have the
;same prefix (e.g., manaus_cls1-5.rules, manaus_cls1-5.md,
;manaus_cls1-5.ctl, etc.). The classified image output
;by CLASSIFYD should have the same form (e.g., manaus_cls1-5.cls);
;This will be specified in advance by the SRULECONV control
;(*.ctl)-file output and contribute to the user’s ability to
;archive, recall, modify, and replicate classifications.
;
;2) The user may not want to use some of the above features and
;just use portions of the program for their own purposes (e.g.,
;if they have no test sites, or don’t have them in ENVI ROI format).
;These can be toggled on or off based on “y” or “n” variables in
;the code below. Users wishing to turn off some options should
;edit the code below and re-compile it (.r <program name> at the
;IDL prompt).
;
;(This gives the user flexibility without having to go through
;multiple tedious manual prompts, which I personally hate.)
;
;
;Inputs and outputs:
;
;With all features running, SRULECONV needs these inputs:
;
;1) For decision-tree rules conversion and generation of
;the confusion matrix for the training set, an ASCII file
;containing the actual rules generated by the S-plus binary-
;decision-tree algorithm. If the user has snipped the tree,
;the output tree rules should be used.
;
;In S-plus, the rules are printed to
;screen by simply typing the name of the tree object. They can
;then be pasted into an ASCII file using e.g. Windows Notepad
;with the WORD-WRAP option OFF. Alternatively, an S-plus script
;could write the rules to a text file.
;
;NOTE: As of version 2.0, an ASTERISK (*) MUST be manually added
;to the end of the first line of the rules (the "1) root ..." line),
;one space from the preceding column as with the other asterisks. This
;will allow the IDL text-import widget to identify this last
;column (the asterisks tell the program which rules are leaves
;of a tree).
;
;2) To construct a control file, SRULECONV will need to get
;the path & name of the image to be classified, and the # lines
;and # samples. Presumably this could be gathered by PICKFILE-ing
;the image, or entered manually.
;
;3) To generate a test-site confusion matrix, the ENVI ROIs for
;the test sites must be available and must use the same class
;names as the training-site classes.
;
;
;And will generate these outputs:
;(the user will have to specify a filename for the rules file,
;and the program should carry this name through to all of the
;output files)
;
;e.g., if classification run name = manaus2
;
;CLASSIFYD rules file manaus2.rules
;CLASSIFYD control file manaus2.ctl
;Run file (runs classifyd) manaus2_run.exe
;Classification metadata file manaus2.md
;(contains error matrices, time & date, files used, etc., plus
;any comments added by the user)
;
;
;
;History of this program:
;
;The preceding draft version was SPLUS2CLASSIFYD_NJM2.PRO.
;This version simply generated the CLASSIFYD rules and printed
;them to screen. Version 2.0 cleans up junk code and adds the
;auxiliary functions.
;
;That draft was itself a modification of Kerry Halligan’s
;splus2classifyd.pro (Jan. 2001). It took advantage of
;Halligan’s text-import functions and resulting structure.
;It used a modified IMPORT_ASCII.pro routine to import the ASCII
;S-plus file.
;

;
;
;INSTRUCTIONS
;
;1) MAKE SURE that you have manually inserted an asterisk at the
;end of the first line (the one starting with "1) root") of the S-plus rules.
;
;1.5) Make sure to save the above change.
;
;2) Compile SRULECONV.PRO (“.r <filename>” at IDL prompt,
;CTRL-F5 in Windows)
;
;3) Type sruleconv at the prompt & hit return to run it (or hit F5 in
;Windows).
;
;4) The “Open File” widget should run first. Select the ASCII
;file containing the rules.
;
;5) The text import widget should run. Highlight the FIRST
;row of the actual rules (starts with “1) root” ) and hit NEXT.
;
;6) KEY STEP: because not all rows have asterisks, some rows are
;one column short. In the “Number of Fields Per Line” box,
;delete all numbers except the largest, which should equal the
;number of classes used in the training set + 8 (e.g., 14 for the
;6-class Madagascar example below). Hit NEXT, then FINISH.
;
;
;End comments. Happy classifying!
;==============================================================
;

;BE SURE TO MATCH FILENAME AND pro name (here)
pro sruleconv

FORWARD_FUNCTION IMPORT_ASCII_mod2, TRAINING_ROI_ERROR_MATRIX

;IMPORT SPACE-DELIMITED ASCII DATA
varname = strarr(1) ; (strarr = string array)
varname = call_function('IMPORT_ASCII_mod2')
;Calling this function creates a structure in which every
;space-delimited column in the ASCII text rules is stored
;as a column in a structure (where each column can have a
;different data type).
;
;NOTE: There is currently some kind of problem with compiling
;this function on the Dept’s version of IDL for UNIX.

;Info obtained from the image:
;inimage = pickfile …
;use ENVI file query to get dimensions of image

;use pickfile to select the filename for the output image
;outimage = pickfile …
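;Untested sketch of one way to do this with ENVI routines (keyword names
;should be double-checked against the ENVI documentation):
; envi_select, title='Select image to be classified', fid=fid
; if (fid ne -1) then $
;   envi_file_query, fid, ns=nsamples, nl=nlines, fname=inimage
; outimage = dialog_pickfile(title='Select output classified image name')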

;get total number of fields
;optional /LENGTH keyword returns size of structure in bytes

;get field names
;optional /STRUCTURE_NAME keyword returns name of structure
;itself

ntags = N_TAGS(sname) ; N_TAGS gets info from the
; structure SNAME
tnames = TAG_NAMES(sname) ; Names are FIELD01, etc.
;NOTE: Fields should be referenced using tag numbers,
;example: ___.(tag#) instead of ___.___ to avoid naming problems

;Get the number of rows in the structure by using N_ELEMENTS on FIELD01
;This equals the number of rows in the output rules
nrows = N_ELEMENTS(sname.(0)) ; zero is the first column (FIELD01)

;CREATING ALL OF THE INTIALLY NEEDED ARRAYS
; These arrays will contain the information from parsing the structure

rownumber = fltarr(nrows) ; This will be filled with 1, 2,
; 3…nrows (1st column of output).
rowarray = fltarr(nrows) ; Initial splus row number
; (not consecutive)
vararray = fltarr(nrows) ; Refers to which band a decision
; tree branch will be split on
oparray = strarr(nrows) ; Operator: greater (>) or less than (<)

valarray = fltarr(nrows) ; Value array: is the pixel > or <
; than this DN value?
pixelnumarray = fltarr(nrows) ; Value array: Number of pixels in row

classnamearray = strarr(nrows) ; Stores the class name for conclusion
; (??) or majority of each row
asteriskarray = strarr(nrows) ; Stores the last field (asterisk or blank)

yesrulearray = bytarr(nrows) ; Stores the input for output column 4
; (i.e., if test = true, goto this row)
norulearray = bytarr(nrows) ; Stores the input for output column 5
; (i.e., if test = false, goto this row)

nclasses = ntags-8 ; Number of classes = (# structure columns) - 8
; = ntags - 8

trainpercentarray = fltarr(nclasses,nclasses)
; Two-dimensional array for the training-site
; error matrix in percentages (filled in by
; TRAINING_ROI_ERROR_MATRIX; sums and other
; analysis variables are derived from it there)

trainpixelsarray = intarr(nclasses,nclasses)
; Contains pixel totals for error matrix

rulesarray = bytarr(6, nrows) ; 6 output columns by nrows (rows in the Splus rules);
; a byte array (bytarr) is probably best for
; CLASSIFYD, but double-check
; NOTE: THE # OF COLUMNS GOES FIRST, THEN ROWS
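; Worked example (reading off the sample Madagascar run posted below):
; the root split "V7<62.5" becomes the rule row 1 5 62 2 15 0,
; i.e. rule #1, variable 5 (S-plus V7 minus 2), threshold 62 (62.5 rounded
; down), yes-branch -> rule 2, no-branch -> rule 15, class DN 0 (not a leaf)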

;FILLING UP ARRAYS FROM ASCII RULES STRUCTURE

;FIRST COLUMN OF SPLUS OUTPUT ( sname.(0)[i] ) — goes into rowarray

;For-loop: strip the ‘)’ off of the string in the first field and convert to float
for i=0,nrows-1 do begin
length2 = strlen(sname.(0)[i])
rowarray[i] = float(strmid(sname.(0)[i],0,length2-1))
endfor

temp1array=fltarr(nrows) ; temp array to shift rowarray up 1
for i=0,nrows-2 do begin
temp1array[i]=rowarray[i+1] ; Next row down moves up one
endfor
temp1array[nrows-1]=.5 ; Don’t repeat last row just yet
; rowarray[nrows-1] ; This would repeat last row
rowarray=temp1array

;SECOND AND THIRD COLUMNS OF SPLUS OUTPUT
;write the contents of the string array to the 3 temp arrays (vararray, etc.)
for i=1,nrows-1 do begin

;Find position of the operator at which string can be cut,
;splitting the variable number from the <> signs:
;op_pos (= operator position) will be 2 for single digit variable numbers
;op_pos will be 3 for double digit variable numbers

test = strpos(sname.(1)[i],'>')
if (test ne -1) then begin
op_pos = strpos(sname.(1)[i],'>')
endif else begin
op_pos = strpos(sname.(1)[i],'<')
endelse

;Find total string length using strlen
length = strlen(sname.(1)[i]) ;
;SECOND COLUMN OF SPLUS OUTPUT ( sname.(1)[i] ), part 1 — GOES INTO vararray
;Extract the variable string (part 1) and convert to byte format and put in vararray
if (op_pos eq 2) then begin ;eq means =
vararray[i] = float(strmid(sname.(1)[i],op_pos-1,1)) ; I think extracts 1 digit
endif else begin
vararray[i] = float(strmid(sname.(1)[i],op_pos-2,2)) ; I think extracts 2 digits
endelse

;extract the operator (part 2) and put it into oparray
oparray[i] = strmid(sname.(1)[i],op_pos,1)

;find the length of the value substring from the total length and the operator position
vallength = length - op_pos + 1

;extract the value string (part 3) and convert to byte and put in valarray
valarray[i] = float(strmid(sname.(1)[i],op_pos+1,vallength))
endfor

;Fill up ASTERISKARRAY from structure SNAME
i=0
; The root row (i=0) is never a leaf, so take its entry from row 1
; (which is blank, not an asterisk), despite the manually added '*'
asteriskarray[i] = strmid(sname.(ntags-1)[i+1], 0, 1)

for i=1,nrows-1 do begin
;extract the asterisk and put it into asteriskarray
asteriskarray[i] = strmid(sname.(ntags-1)[i], 0, 1)
endfor

; FIRST COLUMN OF CLASSIFYD RULES OUTPUT:
; Use for-loop, float to start.

for i=0,nrows-1 do begin
rownumber[i] = float(i+1)
endfor

; Put byte version of ROWNUMBER into column 1 of RULESARRAY
; A little finagling to get rownumber into the first column only;
; …must be an easier way.

i=0
for j=0,nrows-1 do begin
rulesarray(i) = rownumber(j) ; RULESARRAY is byte
i=i+6
endfor

; SECOND COLUMN OF CLASSIFYD RULES OUTPUT: Variables (band #)
; NOTE: V2 in S+ = band 1 in ENVI = variable 0 in classifyd
; A little finagling to get rownumber into the first column only;
; …must be an easier way, but oh well.

i=1
for j=1,nrows-1 do begin
rulesarray(i) = (byte(vararray(j))-2)
i=i+6
endfor
rulesarray(i) = rulesarray(i-6)

; COLUMN 3:Thresholds
; NOTE: Classifyd needs round numbers.
; Dar says to round 0.5 DOWN; previous convention was to round up

i=2 ; i=2 starts at row 1, column 3
for j=1,nrows-1 do begin
rulesarray(i) = (byte(floor(valarray(j))))
; FLOOR truncates the decimal
i=i+6
endfor
rulesarray(i) = rulesarray(i-6)

; COLUMNS 4 & 5: C++ CLASSIFYD program rules, (aka the tough part)
; NOTE: CLASSIFYD needs round numbers. Current convention is to round
; 0.5 down, then use >= result.

;FILLING TEMP RULE ARRAYS (for output columns 4 & 5)
;Kerry’s advice on the function WHERE: use WHERE to find index of row that
;contains a given number in the first field, e.g. row# = where(sname.(0) eq ‘#’)

for i=0, nrows-2 do begin
;First, test for leaf (denoted by "*" in the S+ ASCII rules)
if (asteriskarray[i] eq '*') then begin
yesrulearray[i] = 0
if ( ((rowarray[i]) – 1) eq (rowarray[i-1]) ) then begin

norulearray[i] = i+2
endif else begin
norulearray[i] = 0 ;was set to 250 for test purposes
endelse

endif else begin
yesrulearray[i] = i+2
temp_wherearray = rowarray
no_then_whererow = where( ( rowarray eq (rowarray[i])+1), count )
no_then_whererow = no_then_whererow+2 ; Fudge factor. Deal.

norulearray[i] = no_then_whererow

if (count ne 1) then begin ; Error trap -- e.g., wrong input
print,'Messed up decision tree, check variable ROWARRAY for identical values'
print,'Count of identical rowarray values = ',count
;TEST PRINT OF rulesarray
print,rulesarray
stop
end
endelse
endfor

;FILL IN RULES FOR OUTPUT COLS 4 & 5:
i=3
for j=0,nrows-1 do begin
rulesarray(i) = (byte(round (yesrulearray(j)) ))
rulesarray(i+1) = (byte(round (norulearray(j)) ))
i=i+6
endfor

;Fill in last row with copy of 2nd to last row
for i=( (nrows*6)-5), ( (nrows*6)-2) do begin
rulesarray(i) = rulesarray(i-6)
endfor
rulesarray((nrows*6)-2) = 0 ;The last row must be 0 in col. 5

;COLUMN 6: Class ID number (this is the DN that will correspond
; to each class in the output image).
;
;There are several logical ways to assign numbers to classes:
; Option #1: The user names their ROIs with numbers in the first place
; Option #2: The numbers are assigned to the named classes alphabetically
; Option #3: The numbers are assigned in the order that the classes
; are first reached in the decision tree
;***USER DEFINED TOGGLE***
;Which option the user prefers will be set by this toggle in the code:

classnamenumberoption = 3 ; e.g., 3 corresponds to option 3.

print,'Current choice for DN assignments to class names = Option #',classnamenumberoption
print,'(this can be changed by changing the option number for '
print,'CLASSNAMENUMBEROPTION in the code)'
print

print,'Number of classes = ',ntags-8
;There are 8 obligate columns in the rules (Eight space-delimited
;fields in the S+ rules are always non-proportion information)
print,'Number of rows in input S-plus ASCII rules = ',nrows

;Setup
for i=0,nrows-1 do begin
classnamearray(i) = sname.(4)[i]
endfor

junkvalue = where(asteriskarray eq '*', count) ;Leave in to get # of *'s
leafcount=count
print,'Asterisk/leaf count = ',leafcount
print

;Get only leaves for the numbering scheme
j=0
leafnamearray=strarr(count)

for i=0,nrows-1 do begin
if(asteriskarray(i) eq '*') then begin
leafnamearray(j) = classnamearray(i) ;A cell for each asterisk
j=j+1
endif else begin
endelse
endfor

;Create array for class names
;nrows known from splus rules

alphanamearray= strarr(nclasses) ;Temp array listing classes

alphanamearray=leafnamearray(UNIQ(leafnamearray, SORT(leafnamearray)))
; Use leafnamearray to get #’s in order w/ leaf classes
; Classnames are now in alphabetical order

;The chosen option will be implemented via CASE statements:
case classnamenumberoption of

1:begin
print,'Case 1: Your class names are the desired classified image DNs.'

; This works the same as case 2, actually. See below.

singlenamearray = strarr(nclasses)
for i=0,nclasses-1 do begin
singlenamearray(i)=alphanamearray(i)
endfor
end

2:begin
print,'Case 2: Your class names will be numbered alphabetically.'
; just use alphanamearray order for classification
; (should be alphabetical)

singlenamearray = strarr(nclasses)
for i=0,nclasses-1 do begin
singlenamearray(i)=alphanamearray(i)
endfor
end

3:begin
print,'Case 3 is running: Your class names will be numbered in the'
print,'order they first appear in a leaf in the S-plus decision tree.'

; The trick here is to convert text names into numbers and output a translation/metadata file, e.g.:
; DN Class name
; 1 Burn
; 2 Pfor
; 3 Sfor …etc.

; Get classnames in leaf appearance order (i.e., the first classified leaf class gets DN #1)

j=0
classorder=bytarr(nclasses); = position of class names in leafnamearray
classnums=fltarr(count) ; classnums = class numbers

for i=0,nclasses-1 do begin
test=0
for j=0,leafcount-1 do begin
if( (alphanamearray(i) eq leafnamearray(j)) AND (test ne 1)) then begin
classorder(i)=j
test=1
endif else begin
endelse
endfor
endfor

;for i=0,nclasses-1 do begin
; print,’alphanamearray: ‘,alphanamearray(i),’ order: ‘,classorder(i)
;endfor

singlenameorder = sort(classorder)
;print,singlenameorder

singlenamearray = strarr(nclasses)
for i=0,nclasses-1 do begin
singlenamearray(i)=alphanamearray(singlenameorder(i))
endfor

;print,singlenamearray
end

endcase

print
print
print,'KEY FOR CLASSIFIED IMAGE DN VALUES'
;print,'(First leaf reached = 1, etc.)'
print,'==========================='
print,' DN Class name (from ROIs)'
print,'==========================='
for i=0, nclasses-1 do begin
print,i+1,'   ',singlenamearray(i)
endfor
print,'==========================='
print
print

;FILL IN LAST COLUMN (COLUMN 6) WITH CORRESPONDING CLASS DNs
classnumberarray=bytarr(nrows)
for i=0,nrows-1 do begin
classnumberarray(i) = 1+where(classnamearray(i) eq singlenamearray, count)
endfor

i=5
for j=0,nrows-1 do begin
if(asteriskarray(j) eq '*') then begin
rulesarray(i) = classnumberarray(j)
endif else begin
rulesarray(i) = 0
endelse
i=i+6
endfor

print
print,'Classifyd rules:'
print
print,rulesarray

;SECONDARY FUNCTIONS
;#1: PRODUCE CONFUSION MATRIX FROM S-PLUS ASCII RULES
cmatrix_bypixel_tr = bytarr(nclasses,nclasses) ; Will contain pixels in matrix
cmatrix_bypercent_tr = fltarr(nclasses,nclasses) ; Will contain percentages

asdf=bytarr(1)
;asdf =call_function(
asdf=TRAINING_ROI_ERROR_MATRIX(sname,alphanamearray,asteriskarray,$
pixelnumarray,trainpercentarray,leafcount,nrows,nclasses,classnamearray,trainpixelsarray)

;

;varname = call_function(‘IMPORT_ASCII_mod2’)

end

;Kerry Halligan figured out and modified this function. It is undetermined whether it
;will work in IDL versions older than 5.4.
;*** The function below is an almost exact duplicate of IMPORT_ASCII.pro; the modifications
;are noted as comments. It is the widget routine that formats the input ASCII file
;and returns the resulting structure as a variable called 'sname'.
;
; $Id: import_ascii.pro,v 1.8 2000/07/14 16:35:13 chris Exp $
; Copyright (c) 1999-2000, Research Systems, Inc. All rights reserved.
; Unauthorized reproduction prohibited.
; NAME: IMPORT_ASCII
; PURPOSE: This routine is a macro allowing the user to read in an ASCII
; file and have the contents placed in the current scope as a
; structure variable.
; CATEGORY: Input/Output
; CALLING SEQUENCE: IMPORT_ASCII
; OUTPUTS: This procedure creates a structure variable and places it in the current scope. The variable is named ‘filename_ascii’ where filename is the main part of the file’s name not using the extension.
; EXAMPLE: IMPORT_ASCII
; MODIFICATION HISTORY: Written by: Scott Lasica, July, 1999
; Modified: CT, RSI, July 2000: moved varName out to IMPORT_CREATE_VARNAME
;
;original: pro IMPORT_ASCII
function IMPORT_ASCII_mod2
COMPILE_OPT hidden, strictarr
catch,error_status
if (error_status ne 0) then begin
dummy = DIALOG_MESSAGE(!ERROR_STATE.msg, /ERROR, $
TITLE='Import_Ascii Error')
return, ''   ; nothing useful to return on error
endif
filename=DIALOG_PICKFILE(TITLE='Select ASCII Splus rules file to read.',/READ,$
FILTER='*.*',/MUST_EXIST, GET_PATH=gp)
if (filename eq '') then return, ''
templ = ASCII_TEMPLATE(filename, CANCEL=cancel)
if (cancel) then return, ''
tempStr = READ_ASCII(filename, TEMPLATE=templ)
;; Store the return variable into a var for the user
varName = IMPORT_CREATE_VARNAME(filename, gp, '_ascii')
;original: void = ROUTINE_NAMES(varName, STORE=ROUTINE_NAMES(/LEVEL)-1, tempStr)
;modified line:
void = ROUTINE_NAMES('sname', STORE=ROUTINE_NAMES(/LEVEL)-1, tempStr)
;line below added
return, varName
end

function TRAINING_ROI_ERROR_MATRIX,sname,alphanamearray,asteriskarray,$
pixelnumarray,trainpercentarray,leafcount,nrows,nclasses,classnamearray,trainpixelsarray
print
print,'...running the training-site confusion-matrix function, TRAINING_ROI_ERROR_MATRIX'
print
;Fill in pixelnumarray
for i=0,nrows-1 do begin
pixelnumarray[i] = float(sname.(2)[i])
endfor
; print,pixelnumarray

;Create & fill in temp array holding accuracy information
rawmatrixinfoarray = fltarr(nclasses,nrows)
j=0
i=0
for h=0,nclasses-1 do begin
j=h

for i=0,(nrows-1) do begin
rawmatrixinfoarray[j] = float(sname.(h+6)[i])
;print,’h=’,h,’ j=’,j,’ i=’,i,’ sname=’,(sname.(6+h)[i])
j=j+nclasses
endfor
endfor
; print,rawmatrixinfoarray

;Sort array by asterisk:
asteriskorder=bytarr(leafcount) ; Will contain the index of
; asterisk names in leafnamearray
asteriskorder = sort(asteriskarray)
; print,asteriskorder
asteriskorder = reverse(asteriskorder)

sortedasterisks = strarr(leafcount)
for i=0,leafcount-1 do begin
sortedasterisks(i)=asteriskarray(asteriskorder(i))
endfor

; print,sortedasterisks

matrixinfoarray=fltarr(nclasses,leafcount); Will get the pixel info for just leaves
j=0 ;j is col#
i=0 ;i is row#
for i=0,(leafcount-1) do begin
rowindex = bytarr(nclasses*leafcount)
rowindex=(asteriskorder(i)*nclasses)
for j=0,nclasses-1 do begin
matrixinfoarray(i*(nclasses)+j)=rawmatrixinfoarray(rowindex+j)
endfor
endfor
; print, matrixinfoarray

;Sort the class names by asterisk
temp_sortedclassnamearray=strarr(leafcount)
for i=0,leafcount-1 do begin
temp_sortedclassnamearray(i)=classnamearray(asteriskorder(i))
endfor
; print,temp_sortedclassnamearray

;Sort the result alphabetically
sortedclassnamearray=strarr(leafcount)
nameorder=bytarr(leafcount)
nameorder=sort(temp_sortedclassnamearray)
sortedclassnamearray=temp_sortedclassnamearray(nameorder)

;Sort MATRIXINFOARRAY alphabetically by class (INDEX):
j=0 ;j is col#
i=0 ;i is row#
temp_matrixinfoarray=fltarr(nclasses,leafcount)
for i=0,(leafcount-1) do begin
rowindex = bytarr(nclasses,leafcount)
rowindex(i) = (nameorder(i)*nclasses)
for j=0,nclasses-1 do begin
; print,’nameorder=’,nameorder(i),’ i=’,i,’ j=’,j,’ nclasses=’,nclasses,’ rowindex=’,rowindex(i)
temp_matrixinfoarray((i*nclasses)+j)=matrixinfoarray(rowindex(i)+j)
endfor
endfor

; print,temp_matrixinfoarray
matrixinfoarray=temp_matrixinfoarray

;Sort the pixel totals by asterisk
temp_sortedpixelnumarray=fltarr(leafcount)
for i=0,leafcount-1 do begin
temp_sortedpixelnumarray(i)=pixelnumarray(asteriskorder(i))
endfor
; print,temp_sortedpixelnumarray

;Sort the result alphabetically
sortedpixelnumarray=strarr(leafcount)
sortedpixelnumarray=temp_sortedpixelnumarray(nameorder)
; print,sortedpixelnumarray

; j=0
; for i=0,leafcount-1 do begin
; print,sortedclassnamearray(i),’ ‘,matrixinfoarray(j),$
; ‘ ‘,matrixinfoarray(j+1),’ ‘,matrixinfoarray(j+2),’ ‘,matrixinfoarray(j+3),$
; ‘ ‘,matrixinfoarray(j+4),’ ‘,matrixinfoarray(j+5),’ ‘,sortedpixelnumarray(i)
; j=j+nclasses
; endfor

;combine the numerical matrices
totalsarray=fltarr(nclasses+1,leafcount)

j=0 ;j is col#
i=0 ;i is row#
for i=0,(leafcount-1) do begin
for j=0,nclasses-1 do begin
; print,’nameorder=’,nameorder(i),’ i=’,i,’ j=’,j,’ nclasses=’,nclasses,’ rowindex=’,rowindex(i)
totalsarray((i*(nclasses+1))+j)=round((matrixinfoarray((i*(nclasses))+j)*sortedpixelnumarray(i)))
endfor
for k=0,(leafcount-1) do begin
totalsarray((k*(nclasses+1))+(nclasses+0))=sortedpixelnumarray(k);fill in last column
endfor
endfor
;print,’Pixel totals sorted by class into TOTALSARRAY:
;print,totalsarray
;print,’ ‘

mincarray=fltarr(nclasses+1,nclasses) ;MINCARRAY=Minimum data array

;Now, add up the rows that are the same class:
mincarray(*,0) = totalsarray(*,0) ;Here, it’s (col,row)

i=0 ;i=mincarray row counter
for row=0,leafcount-2 do begin ;row=totalsarray row counter
;(leave out last row of totalsarray)
if ( sortedclassnamearray(row) eq sortedclassnamearray(row+1) ) then begin
mincarray(*,i)=( mincarray(*,i) + totalsarray(*,(row+1)) )
endif else begin
i=i+1
mincarray(*,i)=totalsarray(*,(row+1))
endelse
endfor

;Put pixels in TRAINPIXELSARRAY
for col=0,nclasses-1 do begin
trainpixelsarray[col,*]=mincarray[col,*]
endfor

;Put S-plus right-column pixel totals in STOTALPIXELS
stotalpixels=intarr(nclasses)
stotalpixels=mincarray[nclasses,*] ; last column (index nclasses) holds the S-plus row totals

;MUST HAVE () FOR FUNCTIONS, like TOTAL
;SQUARE BRACKETS [] ARE RECOMMENDED FOR VARIABLES

sumrows=TOTAL(trainpixelsarray,2) ;summing DOWN matrix
sumcols=TOTAL(trainpixelsarray,1) ;summing ACROSS matrix

;PRINT OUT CONFUSION MATRIX
;Set up dashes
dasharray=strarr(nclasses)
for i=0,nclasses-1 do begin
dasharray[i]="------------------------------"
endfor
dash="----------------------------------"
dashes = 'print,FORMAT="(1A,'' '',' + string(nclasses) + 'A10,'' '',A)",dash,dasharray,dash'
;r = execute(dashes)
print
print
print,'CONFUSION MATRIX FOR TRAINING SITES — Pixel counts'
r = execute(dashes)
fs = 'print,FORMAT="(1A,'' '',' + string(nclasses) + '(A10, :, '' ''),'' '',A)","Class",alphanamearray,"TOTALS"'
r = execute(fs)
r = execute(dashes)
for row=0,nclasses-1 do begin
;build a print format statement using the nclasses variable
;for the number of classes to print pixels for, first variable is
;a string (trainpixelsarray)
fs = 'print,FORMAT="(1A,'' '',' + string(nclasses) + '(I10, :, '' ''),'' '',I)",alphanamearray[row],trainpixelsarray[0:nclasses-1,row],sumcols[row]'
r = execute(fs) ; execute the print string
endfor
r = execute(dashes)
fs = 'print,FORMAT="(1A,'' '',' + string(nclasses) + '(I10, :, '' ''),'' '',I)","TOTAL",sumrows,TOTAL(sumrows)'
r = execute(fs)
r = execute(dashes)

;DEVELOP PERCENTAGE MATRIX
fracterrorarray = fltarr(nclasses+1,nclasses+1)
for row=0,nclasses-1 do begin
for col=0,nclasses-1 do begin
fracterrorarray(col,row) = (trainpixelsarray(col,row) / sumrows(col))
endfor
endfor

for row=0,nclasses-1 do begin
fracterrorarray(nclasses-0,row) = (sumcols(row) / TOTAL(sumrows))
endfor

for col=0,nclasses-1 do begin
fracterrorarray(col,nclasses-0) = (sumrows(col) / sumrows(col))
endfor

fracterrorarray(nclasses-0,nclasses-0) = (TOTAL(sumrows) / TOTAL(sumrows))
trainpercentarray=fracterrorarray*100

print
print,'CONFUSION MATRIX FOR TRAINING SITES — Percentages'
r = execute(dashes)
fs = 'print,FORMAT="(1A,'' '',' + string(nclasses) + '(A10, :, '' ''),'' '',A)","Class",alphanamearray,"TOTALS"'
r = execute(fs)
r = execute(dashes)
for row=0,nclasses-1 do begin
;build a print format statement using the nclasses variable
;for the number of classes to print pixels for, first variable is
;a string (trainpixelsarray)
fs = 'print,FORMAT="(1A,'' '',' + string(nclasses+1) + '(F10.2, :, '' ''))",alphanamearray[row],trainpercentarray[0:nclasses,row]'
r = execute(fs) ; execute the print string
endfor
r = execute(dashes)
fs = 'print,FORMAT="(1A,'' '',' + string(nclasses+1) + '(F10.2, :, '' ''))","TOTAL",trainpercentarray[0:nclasses,nclasses]'
r = execute(fs)
r = execute(dashes)

;CALCULATE STATISTICS
N=float(TOTAL(sumrows))

;calc xkk
xkk=float(0)
row=0
for col=0,nclasses-1 do begin
xkk=xkk+trainpixelsarray(col,row)
row=row+1
endfor

nxkk=N*xkk

xkpxpk = float(0)
row=0
for col=0,nclasses-1 do begin
xkpxpk=xkpxpk+(sumrows(col)*sumcols(row))
row=row+1
endfor

kappa=(nxkk-xkpxpk)/((N^2)-xkpxpk)

print
print
print,'ERROR STATISTICS:'
print,'N = ',N,' ( = total number of training points used in tree)'
print,'xkk = ',xkk
print,'N * xkk = ',nxkk
print,'xk+x+k = ',xkpxpk
print,'Kappa = ',kappa
print

;CALCULATE USER’S & PRODUCER’S ACCURACIES
producers=fltarr(nclasses)
users=fltarr(nclasses)

row=0
for col=0, nclasses-1 do begin
producers[row]=(trainpixelsarray(col,row)/sumcols(row))
row=row+1
endfor
producers=producers*100

row=0
for col=0, nclasses-1 do begin
users[row]=(trainpercentarray(col,row))
row=row+1
endfor

print
print,'ACCURACIES (%)'
print,'--------------------------------------------'
print,"Class      Producer's      User's"
print,'--------------------------------------------'
for row=0,nclasses-1 do begin
;print,alphanamearray[row],'   ',producers[row],'   ',users[row]
print,FORMAT='(A,"   ",F6.2,"   ",F6.2)',alphanamearray[row],producers[row],users[row]
endfor
print,'--------------------------------------------'

print
print
print,"Congratulations! You're finished. Have a nice day."
print
print

return,1
end


matzke

posted 02-20-2001 04:54 AM
Here’s the input (from training sites extracted from the Madagascar p158, r73 scene using Kerry’s program):
Splus tree:
Filename: 0sub1tree2-srules.txt

code:

> sub1tree2
node), split, n, deviance, yval, (yprob)
* denotes terminal node
1) root 786 2723.00 pf ( 0.17810 0.18960 0.054710 0.18580 0.181900 0.209900 ) *
2) V7<62.5 501 1367.00 pf ( 0.01397 0.28740 0.085830 0.28140 0.001996 0.329300 )
4) V6<75.5 216 309.30 pf ( 0.03241 0.01852 0.194400 0.00000 0.000000 0.754600 )
8) V5<57.5 49 40.19 dw ( 0.14290 0.00000 0.857100 0.00000 0.000000 0.000000 )
16) V7<45 42 0.00 dw ( 0.00000 0.00000 1.000000 0.00000 0.000000 0.000000 ) *
17) V7>45 7 0.00 db ( 1.00000 0.00000 0.000000 0.00000 0.000000 0.000000 ) *
9) V5>57.5 167 37.76 pf ( 0.00000 0.02395 0.000000 0.00000 0.000000 0.976000 ) *
5) V6>75.5 285 439.90 gv ( 0.00000 0.49120 0.003509 0.49470 0.003509 0.007018 )
10) V5<104 144 44.87 dv ( 0.00000 0.97220 0.006944 0.00000 0.006944 0.013890 )
20) V2<77.5 124 0.00 dv ( 0.00000 1.00000 0.000000 0.00000 0.000000 0.000000 ) *
21) V2>77.5 20 28.33 dv ( 0.00000 0.80000 0.050000 0.00000 0.050000 0.100000 )
42) V3<61.5 5 13.32 pf ( 0.00000 0.20000 0.200000 0.00000 0.200000 0.400000 ) *
43) V3>61.5 15 0.00 dv ( 0.00000 1.00000 0.000000 0.00000 0.000000 0.000000 ) *
11) V5>104 141 0.00 gv ( 0.00000 0.00000 0.000000 1.00000 0.000000 0.000000 ) *
3) V7>62.5 285 481.40 lb ( 0.46670 0.01754 0.000000 0.01754 0.498200 0.000000 )
6) V5<54.5 130 11.73 db ( 0.99230 0.00000 0.000000 0.00000 0.007692 0.000000 ) *
7) V5>54.5 155 124.60 lb ( 0.02581 0.03226 0.000000 0.03226 0.909700 0.000000 )
14) V5<85 145 36.61 lb ( 0.02759 0.00000 0.000000 0.00000 0.972400 0.000000 ) *
15) V5>85 10 13.86 dv ( 0.00000 0.50000 0.000000 0.50000 0.000000 0.000000 ) *

And here’s the output:

code:

ENVI> sruleconv
Current choice for DN assignments to class names = Option # 3
(this can be changed by changing the option number for
CLASSNAMENUMBEROPTION in the code)
Number of classes = 6
Number of rows in input S-plus ASCII rules = 19
Asterisk/leaf count = 10

Case 3 is running: Your class names will be numbered in the
order they first appear in a leaf in the S-plus decision tree.

KEY FOR CLASSIFIED IMAGE DN VALUES
===========================
DN Class name (from ROIs)
===========================
1 dw
2 db
3 pf
4 dv
5 gv
6 lb
===========================

Classifyd rules:

1 5 62 2 15 0
2 4 75 3 8 0
3 3 57 4 7 0
4 5 45 5 6 0
5 5 45 0 6 1
6 3 57 0 0 2
7 4 75 0 0 3
8 3 104 9 14 0
9 0 77 10 11 0
10 0 77 0 11 4
11 1 61 12 13 0
12 1 61 0 13 3
13 3 104 0 0 4
14 5 62 0 0 5
15 3 54 16 17 0
16 3 54 0 17 2
17 3 85 18 19 0
18 3 85 0 19 6
19 3 85 0 0 4

…running the training-site confusion-matrix function, TRAINING_ROI_ERROR_MATRIX

CONFUSION MATRIX FOR TRAINING SITES — Pixel counts
------------------------------------------------------------------------
Class          db        dv        dw        gv        lb        pf    TOTALS
------------------------------------------------------------------------
db            136         0         0         0         1         0       137
dv              0       144         0         5         0         0       149
dw              0         0        42         0         0         0        42
gv              0         0         0       141         0         0       141
lb              4         0         0         0       141         0       145
pf              0         5         1         0         1       165       172
------------------------------------------------------------------------
TOTAL         140       149        43       146       143       165       786
------------------------------------------------------------------------

CONFUSION MATRIX FOR TRAINING SITES — Percentages
------------------------------------------------------------------------
Class          db        dv        dw        gv        lb        pf    TOTALS
------------------------------------------------------------------------
db          97.14      0.00      0.00      0.00      0.70      0.00     17.43
dv           0.00     96.64      0.00      3.42      0.00      0.00     18.96
dw           0.00      0.00     97.67      0.00      0.00      0.00      5.34
gv           0.00      0.00      0.00     96.58      0.00      0.00     17.94
lb           2.86      0.00      0.00      0.00     98.60      0.00     18.45
pf           0.00      3.36      2.33      0.00      0.70    100.00     21.88
------------------------------------------------------------------------
TOTAL      100.00    100.00    100.00    100.00    100.00    100.00    100.00
------------------------------------------------------------------------

ERROR STATISTICS:
N = 786.000 ( = total number of training points used in tree)
xkk = 769.000
N * xkk = 604434.
xk+x+k = 112888.
Kappa = 0.973536

ACCURACIES (%)
--------------------------------------------
Class      Producer's      User's
--------------------------------------------
db              99.27       97.14
dv              96.64       96.64
dw             100.00       97.67
gv             100.00       96.58
lb              97.24       98.60
pf              95.93      100.00
--------------------------------------------

Congratulations! You're finished. Have a nice day.

…the battlestation is fully operational.
Nick


matzke

posted 02-20-2001 05:27 AM
The output *.cls image is on this page. There are only very minor differences compared to the previous classification with the same tree (rounding up vs. down only changes the class of a small percentage of the pixels).
http://www.geog.ucsb.edu/~matzke/mad/p158r73-zoompairs.html
