Sociocultural ML data collection
Related: Jo and Gebru (2019) point out that many AI fairness problems are rooted in the data collection and annotation process, and offer “five key approaches in document collection practices in archives that can inform data collection in sociocultural ML.” These can be summarized as consent, inclusivity, power, transparency, and ethics & privacy. Details are in Table 1 of their paper, Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning.