Read in and format Finnish survey text responses from a `svydesign` object
Source: R/01b_prepare_svydesign.R
fst_prepare_svydesign.Rd
Creates a dataframe in CoNLL-U format from a `svydesign` object that contains Finnish text, using the [udpipe] package and a Finnish language model. Weights are included if they are present in the `svydesign` object, along with any columns added through `add_cols`. Stopwords and punctuation are removed unless the `stopword_list` argument is `"none"`.
Usage
fst_prepare_svydesign(
svydesign,
question,
id,
model = "ftb",
stopword_list = "nltk",
language = "fi",
use_weights = TRUE,
add_cols = NULL,
manual = FALSE,
manual_list = ""
)
Arguments
- svydesign
A `svydesign` object which contains an open-ended question.
- question
The column in the dataframe which contains the open-ended question.
- id
The column in the dataframe which contains the ids for the responses.
- model
A language model available for [udpipe], such as `"ftb"` (default) or `"tdt"`, both of which are available for Finnish.
- stopword_list
A valid Finnish stopword list (default `"nltk"`), or `"none"` to leave stopwords and punctuation in place.
- language
The two-letter ISO code of the language for the stopword list.
- use_weights
Optional, whether to use the weights within the `svydesign` object.
- add_cols
Optional, a column (or columns) from the dataframe containing other information you'd like to retain (for instance, dimension columns for splitting the data for comparison plots).
- manual
An optional boolean indicating that a manual list will be provided; `stopword_list = "manual"` can be used as well or instead.
- manual_list
A manual list of stopwords.
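To make the roles of `svydesign`, `question`, `id`, `use_weights` and `add_cols` more concrete, here is a minimal sketch of the information a `svydesign` object carries. It uses only the `survey` package; the toy data, its column names and the `responses` dataframe are hypothetical, and this is not a description of how `fst_prepare_svydesign()` is implemented internally.

```r
library(survey)

# Toy data standing in for a survey with one open-ended question.
# All column names and values here are hypothetical.
toy <- data.frame(
  fsd_id = 1:3,
  q7     = c("Koulu on kiva.", "En tiedä.", "Kaverit ja perhe."),
  paino  = c(0.8, 1.2, 1.0),
  gender = c("F", "M", "F")
)

svy_toy <- svydesign(ids = ~1, weights = ~paino, data = toy)

# The design object keeps the original columns in `$variables`,
# and `weights()` returns the sampling weights.
responses <- data.frame(
  doc_id = svy_toy$variables$fsd_id,               # the `id` column
  text   = as.character(svy_toy$variables$q7),     # the `question` column
  weight = as.numeric(weights(svy_toy)),           # used when weights are wanted
  gender = as.character(svy_toy$variables$gender)  # an `add_cols` candidate
)

print(responses)
```

Each row of `responses` pairs one text response with its id, weight and an extra column, which is the kind of per-response information the arguments above refer to.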
Details
`fst_prepare_svydesign()` produces a dataframe containing Finnish survey text responses in CoNLL-U format with stopwords optionally removed.
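As an illustration of what optional stopword and punctuation removal means for token-level data, the following base R sketch filters a small hand-written table laid out in the spirit of CoNLL-U (one row per token). The rows, the column subset, and the choice to match stopwords against the `lemma` column are assumptions made for the example, not a statement of the package's internals.

```r
# Hypothetical token-level rows; real CoNLL-U output has more columns.
tokens <- data.frame(
  doc_id = c(1, 1, 1, 1, 2, 2, 2),
  token  = c("Koulu", "on", "kiva", ".", "En", "tiedä", "."),
  lemma  = c("koulu", "olla", "kiva", ".", "ei", "tietää", "."),
  upos   = c("NOUN", "AUX", "ADJ", "PUNCT", "AUX", "VERB", "PUNCT"),
  stringsAsFactors = FALSE
)

# A hypothetical manual stopword list, given as a character vector.
manual_list <- c("olla", "ei")

# Keep rows that are neither stopwords nor punctuation.
is_stopword    <- tokens$lemma %in% manual_list
is_punctuation <- tokens$upos == "PUNCT"
kept <- tokens[!is_stopword & !is_punctuation, ]

print(kept)
```

With a stopword setting of `"none"`, the equivalent of this filtering step would simply be skipped and all rows would be returned.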
Examples
if (FALSE) { # \dontrun{
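# Column holding the response ids in both example datasets
i <- "fsd_id"
# Weighted design for the `child` data, using the `paino` column as weights
svy_child <- survey::svydesign(id = ~1, weights = ~paino, data = child)
fst_prepare_svydesign(svy_child, question = "q7", id = i, use_weights = TRUE)
# Weighted design for the `dev_coop` data, retaining `gender` as an extra column
svy_d <- survey::svydesign(id = ~1, weights = ~paino, data = dev_coop)
fst_prepare_svydesign(svy_d, question = "q11_2", id = i, add_cols = "gender")
# Positional arguments: question, id, model, stopword_list, language
fst_prepare_svydesign(svy_d, "q11_2", i, "finnish-ftb", "nltk", "fi")
# Delete the udpipe model files if they are present in the working directory
unlink("finnish-ftb-ud-2.5-191206.udpipe")
unlink("finnish-tdt-ud-2.5-191206.udpipe")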
} # }