
There is no 'Other' category under 'Active Growth Period' #17

@sscrewston

> summary(plants)
 Scientific_Name      Duration         Active_Growth_Period Foliage_Color          pH_Min          pH_Max      
 Length:5166        Length:5166        Length:5166          Length:5166        Min.   :3.000   Min.   : 5.100  
 Class :character   Class :character   Class :character     Class :character   1st Qu.:4.500   1st Qu.: 7.000  
 Mode  :character   Mode  :character   Mode  :character     Mode  :character   Median :5.000   Median : 7.300  
                                                                               Mean   :4.997   Mean   : 7.344  
                                                                               3rd Qu.:5.500   3rd Qu.: 7.800  
                                                                               Max.   :7.000   Max.   :10.000  
                                                                               NA's   :4327    NA's   :4327    
   Precip_Min      Precip_Max     Shade_Tolerance      Temp_Min_F    
 Min.   : 4.00   Min.   : 16.00   Length:5166        Min.   :-79.00  
 1st Qu.:16.75   1st Qu.: 55.00   Class :character   1st Qu.:-38.00  
 Median :28.00   Median : 60.00   Mode  :character   Median :-33.00  
 Mean   :25.57   Mean   : 58.73                      Mean   :-22.53  
 3rd Qu.:32.00   3rd Qu.: 60.00                      3rd Qu.:-18.00  
 Max.   :60.00   Max.   :200.00                      Max.   : 52.00  
 NA's   :4338    NA's   :4338                        NA's   :4328    

| You nailed it! Good job!

  |=============================================================================                                    |  68%
| summary() provides different output for each variable, depending on its class. For numeric data such as Precip_Min,
| summary() displays the minimum, 1st quartile, median, mean, 3rd quartile, and maximum. These values help us understand
| how the data are distributed.

...

  |=================================================================================                                |  72%
| For categorical variables (called 'factor' variables in R), summary() displays the number of times each value (or
| 'level') occurs in the data. For example, each value of Scientific_Name only appears once, since it is unique to a
| specific plant. In contrast, the summary for Duration (also a factor variable) tells us that our dataset contains 3031
| Perennial plants, 682 Annual plants, etc.
...

  |======================================================================================                           |  76%
| You can see that R truncated the summary for Active_Growth_Period by including a catch-all category called 'Other'.
| Since it is a categorical/factor variable, we can see how many times each value actually occurs in the data with
| table(plants$Active_Growth_Period).
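
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the `summary(plants)` output above reports `Class :character` for Active_Growth_Period, while the lesson text describes it as a factor. summary() only counts values (and adds a catch-all `(Other)` entry) for factor columns. Since R 4.0.0, `data.frame()` and `read.csv()` default to `stringsAsFactors = FALSE`, which would leave these columns as character. The sketch below uses a made-up `plants_demo` data frame, not the lesson's actual data, to show the difference:

```r
# Hypothetical stand-in for the lesson's plants data; the values are invented.
plants_demo <- data.frame(
  Active_Growth_Period = c("Spring", "Summer", "Spring and Summer", "Fall",
                           "Year Round", "Spring", "Winter",
                           "Fall, Winter and Spring"),
  stringsAsFactors = FALSE  # the default since R 4.0.0
)

# Character column: summary() reports only Length, Class, and Mode.
summary(plants_demo$Active_Growth_Period)

# Converted to a factor, summary() counts each level instead.
agp <- factor(plants_demo$Active_Growth_Period)
summary(agp)

# When there are more levels than maxsum, the least frequent ones
# are collapsed into a single "(Other)" entry.
summary(agp, maxsum = 3)

# table() counts every distinct value whether the column is
# character or factor, so the lesson's next step still works.
table(plants_demo$Active_Growth_Period)
```

If this is the cause, the lesson would need to either convert the character columns to factors when loading the data or update the text that describes the `Other` category.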
