Find the top words, and the top words unique to each group, for different groups of participants. The data is split by the values in the `field` column of the formatted data. Results are shown in the plots pane.
Usage
fst_freq_compare(
data,
field,
number = 10,
norm = NULL,
pos_filter = NULL,
strict = TRUE,
use_svydesign_weights = FALSE,
use_svydesign_field = FALSE,
id = "",
svydesign = NULL,
use_column_weights = FALSE,
exclude_nulls = FALSE,
rename_nulls = "null_data",
unique_colour = "indianred",
title_size = 20,
subtitle_size = 15
)
Arguments
- data
A dataframe of text in CoNLL-U format with an additional `field` column for splitting the data.
- field
Column in `data` used for splitting groups
- number
The number of top words to return, default is `10`.
- norm
The method for normalising the data. Valid settings are `"number_words"` (the number of words in the responses), `"number_resp"` (the number of responses), or `NULL` (raw count returned, default, also used when weights are applied).
- pos_filter
List of UPOS tags for inclusion, default is `NULL`, which means all word types are included.
- strict
Whether to cut off strictly at `number` (ties are ordered alphabetically), default is `TRUE`.
- use_svydesign_weights
Option to weight words using weights from a svydesign object containing the raw data, default is `FALSE`.
- use_svydesign_field
Option to get `field` for splitting the data from the svydesign object, default is `FALSE`
- id
ID column from raw data, required if `use_svydesign_weights = TRUE` and must match the `docid` in formatted `data`.
- svydesign
A svydesign object which contains the raw data and weights.
- use_column_weights
Option to weight words using weights from formatted data which includes an additional `weight` column, default is `FALSE`.
- exclude_nulls
Whether to exclude NULLs in the `field` column, default is `FALSE`.
- rename_nulls
What to fill NULL values with if `exclude_nulls = FALSE`, default is `"null_data"`.
- unique_colour
Colour to display unique words, default is `"indianred"`.
- title_size
Size of the plot title, default is `20`.
- subtitle_size
Size of the title of each individual top words plot, default is `15`.
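The effect of `norm` and `strict` can be illustrated with a small base R sketch. This is not the package's implementation: the `toy` data and the `top_words()` helper are hypothetical, and the sketch assumes that `strict = FALSE` keeps every word tied with the last retained word.

```r
# Illustrative sketch only: NOT the finnsurveytext implementation.
# `toy` and `top_words()` are hypothetical names made up for this example.
toy <- data.frame(
  doc_id = c("r1", "r1", "r1", "r2", "r2", "r3"),
  lemma  = c("koulu", "kaveri", "koulu", "kaveri", "perhe", "raha"),
  stringsAsFactors = FALSE
)

top_words <- function(df, number = 2, norm = NULL, strict = TRUE) {
  # Raw count of each word
  counts <- as.data.frame(table(lemma = df$lemma), stringsAsFactors = FALSE)
  names(counts) <- c("lemma", "n")

  # Choose the denominator according to `norm`
  denom <- if (is.null(norm)) {
    1                               # raw counts
  } else if (norm == "number_words") {
    nrow(df)                        # total number of words
  } else if (norm == "number_resp") {
    length(unique(df$doc_id))       # total number of responses
  } else {
    stop("norm must be NULL, 'number_words' or 'number_resp'")
  }
  counts$occ <- counts$n / denom

  # Most frequent first; ties ordered alphabetically
  counts <- counts[order(-counts$n, counts$lemma), ]

  if (strict) {
    head(counts, number)            # hard cut-off at `number`
  } else {
    # Assumption: keep all words tied with the word in position `number`
    cutoff <- counts$n[min(number, nrow(counts))]
    counts[counts$n >= cutoff, ]
  }
}

top_words(toy, number = 3)                        # strict cut-off, raw counts
top_words(toy, number = 3, strict = FALSE)        # ties at the cut-off kept
top_words(toy, number = 3, norm = "number_resp")  # counts per response
```

In this toy data `perhe` and `raha` are tied, so the strict call returns three words and the non-strict call returns four.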
Examples
fst_freq_compare(fst_child, 'gender', number = 10, norm = "number_resp")
fst_freq_compare(fst_child, 'gender', number = 10, norm = NULL)
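# Weights and the splitting field are both taken from the svydesign object
s <- survey::svydesign(id = ~1, weights = ~paino, data = child)
c2 <- fst_child_2
c <- fst_child
g <- 'gender'
fst_freq_compare(c2, g, number = 10, norm = NULL, pos_filter = NULL, strict = TRUE, use_svydesign_weights = TRUE, use_svydesign_field = TRUE, id = 'fsd_id', svydesign = s)
# Weights are taken from the `weight` column of the formatted data
fst_freq_compare(c, g, use_column_weights = TRUE, strict = FALSE)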
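The idea of "unique" top words, the ones highlighted with `unique_colour`, can be sketched in base R. This is an assumption about the concept rather than the package's code: a word is treated as unique when it appears in the top list of exactly one group. The `top_by_group` data and the `flag_unique()` helper are hypothetical.

```r
# Illustrative sketch only; `top_by_group` and `flag_unique()` are hypothetical.
# Each element holds one group's top words (assumed distinct within a group).
top_by_group <- list(
  girl = c("kaveri", "koulu", "perhe"),
  boy  = c("koulu", "raha", "peli")
)

flag_unique <- function(tops) {
  all_words <- unlist(tops, use.names = FALSE)
  # Words that occur in exactly one group's top list
  seen_once <- names(which(table(all_words) == 1))
  lapply(tops, function(words) {
    data.frame(
      word = words,
      unique = words %in% seen_once,
      stringsAsFactors = FALSE
    )
  })
}

flag_unique(top_by_group)
```

Here `koulu` is shared by both groups, so it is the only word not flagged as unique.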