\section{Conclusion}
\label{sec:fitb:conclusion}
In this chapter, we showed that discriminative relation extraction models can be trained efficiently on unlabeled datasets.
Unsupervised relation extraction models tend to produce impure clusters by enforcing a uniformity constraint at the level of a single sample.
We proposed two losses (named RelDist) to effectively train expressive relation extraction models by enforcing the distribution over relations to be uniform---note that other target distributions could be used.
In particular, we were able to successfully train a deep neural network classifier that had so far only performed well in a supervised setting.
We demonstrated the effectiveness of our RelDist losses on three datasets and showcased their effect on cluster purity.

While forcing a uniform distribution with the distance loss \loss{d} might be meaningful with a low number of predicted clusters, it might not generalize to larger numbers of relations.
Preliminary experiments seem to indicate that this can be addressed by replacing the uniform distribution in Equation~\ref{eq:fitb:uniformity} with the empirical distribution of the relations in the validation set, or with any other appropriate law if no validation set is available.%
\sidenote{In practice, Zipf's law (described in the margin of Section~\ref{sec:relation extraction:oie}) seems to fit the observed empirical distribution quite well.}
This would allow us to drop the \hypothesis{uniform} assumption.

All models presented in this chapter make extensive independence assumptions.
As suggested in Section~\ref{sec:fitb:variants} and shown in subsequent work \parencite{selfore,mtb}, this could be alleviated with sentence representations pre-trained on a language modeling task.
Furthermore, the fill-in-the-blank model is inherently sentence-level.
In the next chapter, we study how to build an unsupervised aggregate relation extraction model using a pre-trained \bertcoder.
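To make the target-distribution substitution discussed above more concrete, the following sketch shows how the uniform target of a batch-level distance loss could be swapped for a Zipf-shaped one.
It is only an illustration under assumptions not made in this chapter: it takes the distance to be a KL divergence between the batch-averaged predicted relation distribution and the target, and the names \texttt{zipf\_target}, \texttt{distance\_loss} and the exponent \texttt{s} are hypothetical.
\begin{verbatim}
import torch

def zipf_target(num_relations: int, s: float = 1.0) -> torch.Tensor:
    # Zipf-shaped target over relation ranks: p(r) proportional to rank(r)^-s.
    ranks = torch.arange(1, num_relations + 1, dtype=torch.float)
    weights = ranks.pow(-s)
    return weights / weights.sum()

def distance_loss(relation_logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # KL(batch-averaged predicted distribution || target); only the target tensor
    # changes when switching between uniform, empirical, or Zipf distributions.
    batch_dist = torch.softmax(relation_logits, dim=-1).mean(dim=0)
    return torch.sum(batch_dist * (batch_dist.clamp_min(1e-12).log() - target.log()))
\end{verbatim}
Passing a uniform tensor as \texttt{target} recovers the behaviour studied in this chapter, while \texttt{zipf\_target} or an empirical validation-set distribution implements the proposed relaxation.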