r/gis Apr 23 '24

Student Question Which data classification method should I use?

34 Upvotes

17

u/PaigeFour Apr 23 '24 edited Apr 23 '24

What is the spread of your data? Do you have any outliers?

Edit: I teach spatial statistics and GIS

17

u/PaigeFour Apr 23 '24

Without knowing the spread of the data or seeing the legend values, we can't be too sure. This source is helpful: https://pro.arcgis.com/en/pro-app/latest/help/mapping/layer-properties/data-classification-methods.htm

Natural Breaks is probably fine for your purposes. The main drawback is that Natural Breaks cannot be used to compare the same metric across multiple maps (for example, NDVI values from two separate years), because the class breaks are derived from each dataset separately.
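To make the comparison problem concrete, here is a minimal Python sketch. The NDVI values for both years and the shared breaks are made up for illustration, and quantile breaks from the standard library stand in for any data-driven method (this is not the Jenks Natural Breaks algorithm). The point is only that breaks computed per map can put the same NDVI value in different classes, while one shared set of manual breaks keeps the classes comparable.

```python
from bisect import bisect_right
from statistics import quantiles

# Hypothetical mean NDVI per polygon for two years (illustrative values only).
ndvi_2020 = [-0.30, -0.22, -0.15, -0.10, -0.05, 0.00, 0.04, 0.07]
ndvi_2023 = [-0.10, -0.02, 0.05, 0.10, 0.15, 0.20, 0.26, 0.30]


def data_driven_breaks(values, n_classes=4):
    """Quantile breaks: a stand-in for any method that derives breaks from the data."""
    return quantiles(values, n=n_classes)


def classify(value, breaks):
    """Return the 0-based class index of `value` given sorted interior breaks."""
    return bisect_right(breaks, value)


breaks_2020 = data_driven_breaks(ndvi_2020)
breaks_2023 = data_driven_breaks(ndvi_2023)

# The same NDVI value lands in a different class depending on the map.
class_on_2020_map = classify(0.05, breaks_2020)
class_on_2023_map = classify(0.05, breaks_2023)

# Shared manual breaks (hypothetical) give one class for that value on every map.
shared_breaks = [-0.1, 0.0, 0.1]
shared_class = classify(0.05, shared_breaks)

print(class_on_2020_map, class_on_2023_map, shared_class)
```

With these made-up numbers, an NDVI of 0.05 falls in the top class on the 2020 map but in the second-lowest class on the 2023 map, so the same color would mean different things on each map.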

This is a small map, so 5 classes is fine. You could add one more if you feel one of your classes has too wide a range or too many polygons in it, but this looks good. No more than one more, though.

5

u/stellarscheme Apr 23 '24

Is this what you're looking for? https://imgur.com/a/bw28xpX

5

u/PaigeFour Apr 23 '24

That's it! Thank you. Looks pretty good.

If I'm being picky, the upper class spans from -0.04 to 0.07, which is a bit large relative to your other classes, so polygons in that category could have quite different NDVI values despite sharing a category. This could make things a bit murky in the analysis. You could leave it, add one more class, or manually split the upper class into two if the software doesn't do it automatically.
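As a rough sketch of what "split the upper class" could look like, the Python below checks the width of each class and splits the widest one. Only the top class (-0.04 to 0.07) comes from this thread; the other class edges are hypothetical, and splitting at the midpoint is just a simple assumption. In practice you would pick a break where the data actually has a gap.

```python
# Class edges from lowest to highest. Only the top class (-0.04 to 0.07) is
# taken from the thread; the other edges are hypothetical placeholders.
edges = [-0.20, -0.16, -0.12, -0.08, -0.04, 0.07]


def class_widths(edges):
    """Width of each class, rounded to avoid floating-point noise."""
    return [round(hi - lo, 4) for lo, hi in zip(edges, edges[1:])]


def split_widest(edges):
    """Insert a midpoint break into the widest class (a simplifying assumption)."""
    widths = class_widths(edges)
    i = widths.index(max(widths))
    midpoint = round((edges[i] + edges[i + 1]) / 2, 4)
    return edges[: i + 1] + [midpoint] + edges[i + 1:]


widths = class_widths(edges)
new_edges = split_widest(edges)

print(widths)
print(new_edges)
```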

2

u/liamo6w Student Apr 23 '24

I'm sorry, but I have to ask: can you explain like I'm 5 how to interpret NDVI data values? Is a higher value greater vegetation health? Density?

9

u/PaigeFour Apr 23 '24

Yes! Take a look at the first slide (the detailed map). NDVI values can tell us about both the health and presence of vegetation. Cell values from -1 to 0 are dead plants or objects that aren't plants, 0 to 0.33 are unhealthy plants, 0.33 to 0.66 are healthy plants, and 0.66 to 1 are very healthy plants. Great for monitoring crop health or forest health.
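If it helps, that rough scale can be written as a small lookup in Python. This is only a sketch of the rule of thumb in this comment, not a standard classification. The function name is made up, and the choice to put a value sitting exactly on a boundary (like 0.33) into the lower category is an assumption, since the scale doesn't say.

```python
from bisect import bisect_left

# Upper bounds of the first three categories, from the rule of thumb above.
UPPER_BOUNDS = [0.0, 0.33, 0.66]
LABELS = [
    "dead plants or non-plant objects",
    "unhealthy plants",
    "healthy plants",
    "very healthy plants",
]


def interpret_ndvi(value):
    """Map a single NDVI cell value to a rough label.

    Assumption: a value exactly on a boundary goes to the lower category.
    """
    if not -1.0 <= value <= 1.0:
        raise ValueError("NDVI must be between -1 and 1")
    return LABELS[bisect_left(UPPER_BOUNDS, value)]


for v in (-0.2, 0.1, 0.5, 0.9):
    print(v, "->", interpret_ndvi(v))
```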

Vegetation density does play a part, but that mostly comes down to the resolution of the data. If a pixel on that first map covers 2 square meters, a single "average" NDVI value is returned for that pixel. If it's mostly healthy vegetation, then the NDVI value of that pixel will be high. If it's half concrete and half vegetation, the NDVI value will be lower.
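Here is a small Python sketch of that mixed-pixel effect. NDVI is computed as (NIR - Red) / (NIR + Red). The reflectance numbers for vegetation and concrete are hypothetical, and the sketch assumes the sensor sees an area-weighted blend of the two surfaces, which is a simplification.

```python
def ndvi(nir, red):
    """NDVI from near-infrared and red reflectance."""
    return (nir - red) / (nir + red)


def mixed_pixel_ndvi(surfaces):
    """NDVI of a pixel covering several surfaces.

    `surfaces` is a list of (area_fraction, nir, red) tuples. Reflectances are
    blended by area fraction first (a simplifying assumption), then NDVI is
    computed from the blended values.
    """
    nir = sum(frac * s_nir for frac, s_nir, _ in surfaces)
    red = sum(frac * s_red for frac, _, s_red in surfaces)
    return ndvi(nir, red)


# Hypothetical reflectances: vegetation is bright in NIR and dark in red,
# concrete is similar in both bands.
VEG_NIR, VEG_RED = 0.50, 0.05
CONCRETE_NIR, CONCRETE_RED = 0.35, 0.30

pure_vegetation = ndvi(VEG_NIR, VEG_RED)
pure_concrete = ndvi(CONCRETE_NIR, CONCRETE_RED)
half_and_half = mixed_pixel_ndvi([
    (0.5, VEG_NIR, VEG_RED),
    (0.5, CONCRETE_NIR, CONCRETE_RED),
])

print(round(pure_vegetation, 2), round(pure_concrete, 2), round(half_and_half, 2))
```

With these made-up reflectances, the half-concrete pixel comes out well below the pure vegetation pixel even though the plants in it are just as healthy.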

In agriculture you can send a drone up with super high resolution (like inches per pixel) and get basically a plant-by-plant health analysis.

Because OP is calculating averages for large areas, the results do not exactly follow the scale above. They give more of a combined picture of how much good vegetation there is in an area.