1. context dependent
There seem to be two broad subcategories here:
2. vague, ambiguous, disambiguated only by context, or otherwise failing to yield a single canonical database answer.
Some of the particular cases noted so far include:
As long as the query is interpretable, only utterances that appear not to be attempts to speak normal conversational English will be excluded. For example, we should exclude attempts to speak some imagined form of "computerese" rather than normal English: "Origin Dallas, destination Boston, list flights."
4. other unanswerable queries
Some subcases:
Utterances that are clearly designed to try to break the system should be excluded: "Given that city A is Oakland and city B is Fort Worth show me all flights from A to B."
NOTES:
Minor syntactic or semantic ill-formedness -- if the query is interpretable, it will be accepted, unless it is so ill-formed that it is clear that it is not intended to be normal conversational English.
Presupposition failures -- all presuppositions about the number of answers (either existence or uniqueness) will be ignored. These are the only types of presupposition failures noted to date. Any other types of presupposition failure that make the query truly unanswerable will presumably result in the wizard being unable to generate a database query, and will be ruled out on those grounds.
Multi-sentence utterances -- These will not automatically be ruled out. The examples cited so far are clearly interpretable as expressing multiple constraints that can be combined into a single query.
PROCEDURE FOR CLASSIFYING SENTENCES:
There are five general categories of non-class-A utterances, with an important special subcase of context-dependent utterances. We will therefore use the following code:
C -- Hopelessly context dependent; COULD NOT reasonably be uttered with an unambiguous context-independent reading.
C1 -- Context dependent, but COULD reasonably be uttered with an unambiguous context-independent reading.
V -- vague, ambiguous, etc.
I -- ill-formed (grossly).
U -- unanswerable (for other reasons).
N -- noncooperative subject.
NOTE: There are some context-dependent queries that could be forced to have a context-independent interpretation, but it would be unreasonable to do so, because of the large amount of data that would be retrieved; for example, "Where are connections made?" Such queries will be classified as C rather than C1.
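For annotators who record these codes in software, the scheme can be kept as a small lookup table. The Python sketch below is illustrative only: the names NonClassACode, parse_code, and is_context_dependent are hypothetical and are not part of the procedure itself; the descriptions are taken from the code list.

```python
from enum import Enum


class NonClassACode(Enum):
    """Codes for non-class-A utterances (class and member layout are assumptions)."""
    C = "hopelessly context dependent"
    C1 = "context dependent, but could have a context-independent reading"
    V = "vague, ambiguous, etc."
    I = "ill-formed (grossly)"
    U = "unanswerable (for other reasons)"
    N = "noncooperative subject"


def parse_code(label: str) -> NonClassACode:
    """Look up a code by its label, e.g. "C1"; raises KeyError if the label is unknown."""
    return NonClassACode[label.strip().upper()]


def is_context_dependent(code: NonClassACode) -> bool:
    """True for C and its special subcase C1."""
    return code in (NonClassACode.C, NonClassACode.C1)
```

Rejecting unknown labels with an error, rather than silently mapping them to a default, is a design choice made here so that mistyped codes surface during annotation.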