2015 Intel ISEF Finalist Abstract: Robotics and Intelligent Machines (AI)

Project Information

Year: 2015
Category: Robotics and Intelligent Machines
Country/State: United States of America

Project Title

Development of an Authorship Identification Algorithm for Twitter Using Stylometric Techniques

Abstract

I developed software that implements semi-supervised learning to dramatically improve accuracy when stylometrically attributing an unidentified tweet to the correct author from a set of known Twitter authors. Existing stylometric techniques generally do not perform well on short texts. Software written in Python streamed, preliminarily processed, and stored 1000 tweets each from up to 30 prolific authors on Twitter. Traditional and flexible bigrams, as well as their frequencies of occurrence, were extracted from both the authors’ known tweets and the unknown tweet, forming each author’s profile. These bigrams were then used as tokens for a Naive Bayes classifier which returned the probability of each author having written the unknown tweet. The first, second, and third most likely authors were determined by the classifier and written as output. After repeating this process multiple times, the percent accuracy of identifying the correct author was calculated. A program was completed that would, to a significant degree of accuracy, identify the author of an unknown tweet. Furthermore, it was found that excluding retweets, using a combination of flexible and traditional bigrams, and other techniques produced the most effective algorithm for stylometrically identifying the author of a tweet. With 10 authors, the algorithm correctly identified the author of the tweet with 73 percent accuracy on the first guess and with 87 percent accuracy within the top three guesses, showcasing the potential of stylometric techniques in application to extremely short messages. Moreover, this algorithm has significant potential in investigating anonymous cyber-crimes committed over social media.
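The classification step described in the abstract can be illustrated with a minimal Python sketch. This is not the author's software: it assumes plain word bigrams only (the abstract does not define "flexible" bigrams, so they are omitted), add-one (Laplace) smoothing, and hypothetical names such as `BigramNaiveBayes`, `rank`, and `top_k_accuracy`. It shows how per-author bigram counts can feed a Naive Bayes score, how the three most likely authors can be returned, and how first-guess and top-three accuracy might be computed.

```python
import math
from collections import Counter, defaultdict


def word_bigrams(text):
    """Split a tweet on whitespace and return its adjacent word pairs."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


class BigramNaiveBayes:
    """Hypothetical multinomial Naive Bayes over word bigrams (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing constant (assumption)
        self.counts = defaultdict(Counter)      # author -> bigram frequencies
        self.tweets_per_author = Counter()      # author -> number of known tweets
        self.vocab = set()                      # every bigram seen in training

    def fit(self, labeled_tweets):
        """Build each author's profile from (author, tweet_text) pairs."""
        for author, text in labeled_tweets:
            grams = word_bigrams(text)
            self.counts[author].update(grams)
            self.tweets_per_author[author] += 1
            self.vocab.update(grams)
        return self

    def rank(self, text, top_k=3):
        """Return up to top_k authors, most likely first, by log-probability."""
        grams = word_bigrams(text)
        total_tweets = sum(self.tweets_per_author.values())
        vocab_size = len(self.vocab)
        scores = {}
        for author, counter in self.counts.items():
            total = sum(counter.values())
            # Log prior from each author's share of the training tweets.
            score = math.log(self.tweets_per_author[author] / total_tweets)
            for gram in grams:
                # Smoothed likelihood of this bigram under the author's profile.
                score += math.log(
                    (counter[gram] + self.alpha)
                    / (total + self.alpha * vocab_size)
                )
            scores[author] = score
        return sorted(scores, key=scores.get, reverse=True)[:top_k]


def top_k_accuracy(model, test_set, k):
    """Fraction of (author, tweet) pairs whose true author is in the top k."""
    hits = sum(1 for author, text in test_set if author in model.rank(text, k))
    return hits / len(test_set)
```

With a fitted model, `model.rank(tweet)[0]` corresponds to the first guess, and `top_k_accuracy(model, held_out, 3)` corresponds to the "within the top three guesses" figure reported in the abstract, computed here on whatever held-out tweets the reader supplies.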


About the Intel ISEF High School Research Competition

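The Intel International Science and Engineering Fair (ISEF) is organized by the Society for Science and the Public in the United States, with Intel as its title sponsor. It is the world's largest and highest-level scientific research and innovation competition for secondary school students. ISEF's categories cover every field of mathematics, natural science, and engineering, along with some of the social sciences. Often called the "World Cup" of youth science competitions, ISEF aims to encourage students to collaborate in teams, to innovate, and to pursue sustained, focused, in-depth research on topics that interest them.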

Category Overview: Robotics and Intelligent Machines

Studies in which the use of machine intelligence is paramount to reducing the reliance on human intervention.

Subcategories:

Biomechanics (BIE): Studies and apparatus which mimic the role of mechanics in biological systems.

Cognitive Systems (COG): Studies/apparatus that operate similarly to the ways humans think and process information. Systems that provide for increased interaction of people and machines to more naturally extend and magnify human expertise, activity, and cognition.

Control Theory (CON): Studies that explore the behavior of dynamical systems with inputs, and how their behavior is modified by feedback. This includes new theoretical results and the applications of new and established control methods, system modelling, identification and simulation, the analysis and design of control systems (including computer-aided design), and practical implementation.

Machine Learning (MAC): Construction and/or study of algorithms that can learn from data.

Robot Kinematics (KIN): The study of movement in robotic systems.

Other (OTH): Studies that cannot be assigned to one of the above subcategories. If the project involves multiple subcategories, the principal subcategory should be chosen instead of Other.
