RightNow Technologies Candidate Questionnaire
1. Imagine you had complete control over your surroundings and could create
the ultimate employment situation.
a. Describe your ideal job.
Working as part of an R&D group on core technology issues related to natural language
processing and text mining, developing research prototypes and working with product
groups to transfer the successful approaches to products.
b. Describe the specific technologies you would be using.
Object-oriented programming, such as C++ or Java, for developing the main research
prototypes, and Perl for text processing and higher-level scripts.
c. What percentage of time would you spend on heads-down technical
work vs. other?
The ideal split would be 70% heads-down, individual technical work and 30% other
work, where I interpret “other” as meetings with fellow R&D and product group
members.
d. Describe the work environment.
I consider myself a team player and would thrive in an environment where
communication and collaboration are actively pursued: a relaxed environment with few
layers of management, where initiative is encouraged.
2. Describe a past technical achievement of which you are especially proud.
Three years ago, as I was beginning to formulate my research, I came across a
fundamental problem in clustering: how to combine different clustering systems. I
thought it was an under-explored but very important problem in data mining. My advisor
could not help me much in this domain since it was not her main expertise. I set out to
explore it myself, from conception of algorithms to implementation. My work was
accepted at a highly selective conference (PKDD 2004, 581 submissions, 18%
acceptance rate) and was honored with the Best Student Paper Award.
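As a rough illustration of what combining clustering systems involves, here is a generic co-association sketch. It is not the method from the PKDD 2004 paper, which is not described in this answer. Items are grouped together when most of the input clusterings agree that they belong together. The function names, the 0.5 threshold, and the toy label lists are all assumptions made for this example.

```python
from itertools import combinations


def co_association(clusterings):
    """For each pair of items, the fraction of clusterings that co-cluster them."""
    n = len(clusterings[0])
    fractions = {}
    for i, j in combinations(range(n), 2):
        together = sum(1 for labels in clusterings if labels[i] == labels[j])
        fractions[(i, j)] = together / len(clusterings)
    return fractions


def consensus_clusters(clusterings, threshold=0.5):
    """Merge items co-clustered in more than `threshold` of the inputs (union-find)."""
    n = len(clusterings[0])
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), fraction in co_association(clusterings).items():
        if fraction > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for item in range(n):
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())


# Three hypothetical clusterings of six items; each uses its own label names.
clusterings = [
    [0, 0, 0, 1, 1, 1],
    ["x", "x", "y", "y", "y", "y"],
    [1, 1, 1, 0, 0, 0],
]

print(consensus_clusters(clusterings))  # [[0, 1, 2], [3, 4, 5]]
```

The label names used by each input clustering never need to match; only the pairwise agreement between items matters, which sidesteps the problem of deciding which cluster in one system corresponds to which cluster in another.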
3. On a scale of 1 – 5, (1=no knowledge, 5=expert, demonstrated by significant
experience) please rate your knowledge of the following technologies:
Technology                                    Rating   Years Experience
C                                             5        8
C++                                           4        5
UNIX software development                     5        8
Microsoft Windows software development        2        2
.NET/C#                                       1        0
SQL                                           1        0
Java                                          1        0
Web application (not web page) development    1        0
HTML/DHTML/Javascript                         1        0
Comments? I have been using Perl as a scripting language for the past 8
years for text processing tasks and to write scripts that control the execution of research
software. I have also been using Matlab for the past 8 years. I also have experience in
developing code to process large amounts of data. This requires developing code to be
run in parallel on multiple machines using pmake and rexport.
4. In your most recent work experience, describe your role – manager, project
lead, member of a small team, member of a large team or individual
contributor. Which role do you prefer?
I have always been a member of a small team or worked individually. I would prefer to work
as a member of a small team.
5. Why should RightNow Technologies hire you over other candidates?
Because I can bring the interdisciplinary knowledge that this R&D position requires.
Working on R&D means having the foundations to offer solutions in a range of different
applications. I have a publication record in natural language processing, data mining
and speech recognition, all based on statistical learning algorithms. And because I love
pursuing and implementing research ideas and I enjoy working very hard to bring them
to fruition.
6. Are you legally eligible for employment in the United States?
Yes, I am. I currently have an F-1 student visa, which can be extended to Optional
Practical Training (OPT).
Applied Research Candidate Questionnaire
1. What is the largest software project you have worked on, both in number of team
members as well as (approximate) number of lines of code?
The SRI Decipher system, a large-vocabulary speech recognition system of tens of
thousands of lines of code, and the Microsoft Research Whisper system, a similar
large-vocabulary speech recognition system, again of tens of thousands of lines of code.
In terms of team members, I have worked with about 1-2 people.
2. Describe your experiences with handling data going into and out of a database at
the code level.
I have been using standard tools for this such as cvs and rcs for Unix.
3. Describe your debugging skills (tools used, processes, etc):
I have been using the gdb debugger as well as some graphical environments recently
available (Eclipse). I have also used Microsoft Visual C++ under Windows. Processes
involved are informal, such as emailing or talking to people.
4. Describe the most difficult bug you solved, and what made the debugging
process particularly hard.
Some years ago, while developing software on HTK, a software toolkit for research
prototype development in speech recognition, I found that the order in which the
speech models are stored in the file can actually change the results of an experiment.
This was a very hard bug to find because it was the last thing that anyone would
expect.
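The answer does not say what caused the order dependence. One generic way storage order can leak into results, shown here purely as a hypothetical illustration and not as a description of HTK, is tie-breaking: when two models score equally, whichever is stored first wins. The model names and scores below are made up.

```python
def best_model(models, score):
    """Return the highest-scoring model.

    max() keeps the first maximum it encounters, so ties are broken
    by the order in which `models` is stored.
    """
    return max(models, key=score)


def best_model_stable(models, score):
    """Same selection, but ties are broken by model name, not storage order."""
    return max(models, key=lambda m: (score(m), m))


# Hypothetical scores containing a tie between "aa" and "ah".
scores = {"aa": 0.7, "ah": 0.7, "eh": 0.2}

print(best_model(["aa", "ah", "eh"], scores.get))         # aa
print(best_model(["ah", "aa", "eh"], scores.get))         # ah
print(best_model_stable(["aa", "ah", "eh"], scores.get))  # ah
print(best_model_stable(["ah", "aa", "eh"], scores.get))  # ah
```

Making the tie-break explicit is one way to keep an experiment reproducible regardless of how the models happen to be ordered on disk.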
5. On a scale of 1 – 4, (1=no knowledge, 2=research or coursework, 3=prototype,
4=shipped production code) please rate your knowledge of the following
technologies:
Technology                                   Rating   Published (yes/no)   Years Experience
Information Retrieval                        2        no                   1
Natural Language Processing                  3        yes                  5
HTTP/spidering                               1        no                   0
Text Clustering                              3        yes                  5
Text Classification                          3        yes                  5
Text Summarization                           2        no                   1
Swarm Intelligence/Ant Colony Optimization   1        no                   0
Collaborative Filtering                      2        no                   1
Ontology/Topic extraction                    1        no                   0
Data Mining                                  3        yes                  5
Machine Learning                             3        yes                  5
Please expand on individual technologies where appropriate:
6. What is the most interesting emerging trend in any of the above areas?
One of the most fascinating trends is the move from supervised to semi-supervised
and unsupervised approaches. Collecting, annotating and cleaning training data are
by far the most expensive steps in the process of developing new applications in
NLP. Ways to reduce the cost of these steps are crucial. Another fascinating trend
is dealing with huge amounts of data, where issues of scalability and speed
emerge.