Open innovation often attracts large quantities of submitted content, yet the need to process such content effectively impedes the widespread use of open innovation in practice. This article explores needs-based open innovation using state-of-the-art natural language processing (NLP) algorithms to address existing limitations in exploiting large amounts of incoming data. Semantic Textual Similarity (STS) algorithms, developed specifically to compare sentence-length text passages, were used to rate the semantic similarity of pairs of text sentences submitted by users of a custom open innovation platform. A total of 341 unique users submitted 1,735 textual problem statements, or unmet needs, relating to multiple topics: cooking, cleaning, and travel. Equivalence scores generated by a consensus of ten human evaluators for a subset of the needs provided a benchmark for similarity comparison. The semantic analysis enabled rapid (one day per topic), automated screening of redundancy to facilitate identification of quality submissions. In addition, a series of permutation analyses characterized how the rate of redundant entries changes as crowd size increases. The results identify top modern STS algorithms for needfinding: these predicted similarity with Pearson correlations of up to .85 when trained on needs-based data and up to .83 when trained on generalized data. Rates of duplication varied with crowd size and may be approximately linear or asymptotic depending on the degree of similarity used as a cutoff. Semantic algorithm performance has improved rapidly in recent years. Potential applications to screen duplicates, and conversely to surface highly unique sentences for rapid exploration of a need space, are discussed.
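The redundancy-screening workflow described above (score every pair of submitted need statements, then flag pairs whose similarity exceeds a cutoff) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it substitutes a simple bag-of-words cosine similarity for the trained STS models used in the study, and the example need statements and the 0.8 cutoff are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity: a lexical stand-in for a trained
    STS model, which would instead score semantic (not surface) overlap."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def screen_duplicates(needs: list[str], cutoff: float = 0.8) -> list[tuple[int, int, float]]:
    """Compare every pair of submissions; return index pairs (with their
    similarity score) that meet or exceed the redundancy cutoff."""
    return [
        (i, j, s)
        for (i, a), (j, b) in combinations(enumerate(needs), 2)
        if (s := cosine_similarity(a, b)) >= cutoff
    ]

# Hypothetical crowd submissions for the "cooking" and "travel" topics.
needs = [
    "I need a faster way to chop vegetables",
    "I need a faster way to chop vegetables safely",
    "Finding cheap flights takes too long",
]
flagged = screen_duplicates(needs, cutoff=0.8)  # flags the near-duplicate pair (0, 1)
```

Swapping `cosine_similarity` for a neural STS model would follow the same pattern; only the scoring function changes, which is what makes the screening step easy to automate at crowd scale.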
