Co-opting student input to generate qualitative remarks

Reference Co-Pilot by Hwa Chong Institution and the String Team

String Team, Gilbert Ng & Joyce Tan
August 25, 2023

Problem statement

This is a familiar pain for teachers of every graduating class: each reference/ testimonial takes ~1hr* to produce as a base draft, and longer if the student is quiet or has little to write about.
Each teacher has about 30-40 students on average, which suggests ~30-40 man-hours spent drafting testimonials/ references.
- Based on user feedback, the estimated time saving is 50% (0.5hrs) per reference, or roughly 15-20 hours per teacher

*Linguistically inclined educators tend to take less time, of course; this is a rough average based on user interviews

Figure 1: existing workflow for reference writing

Target Audience

Form teachers, or teachers with a graduating civics class, who have to deal with longer-form testimonial or reference writing
- Especially those who may struggle to write (e.g. language familiarity, lack of inspiration)
- Teachers with strong teacher-student relationships who can get buy-in from their form class to provide the details needed to improve reference writing, or whose existing workflows already trust and actively consult student input so that students are better represented

Solution

Alongside the String team, Mr Gilbert Ng and Ms Joyce Tan from Hwa Chong Institution (HCI) developed a workflow to reduce the time taken for the first draft of university referrals.

This is likely to work better for Upper Secondary and Junior College students, who have a better sense of self and are better able to represent themselves
The workflow mirrors the mental model of how teachers typically go about writing references (see Figure 2 below)
Incorporating student feedback is already part of most teachers’ existing workflow in order to complement the writing process
The form uses validation to enforce a minimum length, so that there is actually something to process and the issue of nothing in, garbage out is minimized. Admittedly, not all student input is great/ helpful; this problem persists with or without generative AI
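The minimum-length check can be sketched in a few lines of Python. This is only an illustration: the threshold `MIN_WORDS` and the function name `validate_response` are assumptions for this example, and the article does not state the actual minimum or how the form tool implements it.

```python
from typing import Tuple

# Hypothetical threshold; the actual minimum used by the form is not stated.
MIN_WORDS = 30


def validate_response(text: str, min_words: int = MIN_WORDS) -> Tuple[bool, str]:
    """Return (is_valid, message) for one free-text answer from a student."""
    # Whitespace splitting is a rough word count, good enough for a gate.
    word_count = len(text.split())
    if word_count < min_words:
        return False, (
            f"Please write at least {min_words} words "
            f"(currently {word_count})."
        )
    return True, "OK"


# Usage: reject an answer that is too short to process meaningfully.
ok, message = validate_response("I was in the choir.")
```

A check like this only guarantees quantity, not quality, which is why the teacher review steps described later still matter.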

Figure 2: comparison of reference co-pilot workflow and existing teacher workflows

The team refrained from building a full-stack app, which would have required a lot of development and maintenance work and could have restricted the flexibility to pivot or extend.

Instead, familiar no/low-code tools were chosen as far as possible:

Figure 3: HCI developed their prototype using 3rd party solutions while developing on government tooling in parallel

Sample output

Figure 4: Part 1 of reference co-pilot output (combined output after parsing student input into different LLM prompts). Actual student names are obscured

Figure 5: Part 2 of reference co-pilot output (original student input)

Extensions

Output can be easily tweaked to produce testimonials for SGCs as well, as the format and style of writing are largely similar.
Variations for different college requirements (e.g. 200 word limit, 500 word limit) are simply text summarization tasks after the base output is created. This can also be achieved programmatically with a proper workflow design
Rebuilding using FormSG and Plumber to manage data storage concerns
Few-shot prompting and ingesting the style guides for reference writing that schools typically already have, so the output is more consistent with school writing norms
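Two of the extensions above, word-limit variants and few-shot style examples, come down to how the prompt is assembled. The sketch below shows one possible way to do that in Python. The function names, the prompt wording and the placeholder strings are all assumptions made for illustration; they are not the prompts used by the HCI team.

```python
from typing import Dict, List


def build_reference_prompt(
    base_draft: str, style_examples: List[str], word_limit: int
) -> str:
    """Assemble one prompt: style examples first, then the shortening task."""
    parts = ["You are helping a teacher shorten a student reference."]
    # Few-shot part: show the model samples of the school's writing style.
    for i, example in enumerate(style_examples, start=1):
        parts.append(f"Example {i} of the school's writing style:\n{example}")
    parts.append(
        f"Rewrite the draft below in the same style, in at most "
        f"{word_limit} words. Do not add facts that are not in the draft."
    )
    parts.append(f"Draft:\n{base_draft}")
    return "\n\n".join(parts)


def within_limit(text: str, word_limit: int) -> bool:
    """Whitespace word count, to check the model's output after the fact."""
    return len(text.split()) <= word_limit


# One base draft, several college-specific variants (placeholder inputs).
prompts: Dict[int, str] = {
    limit: build_reference_prompt("<base draft here>", ["<style sample>"], limit)
    for limit in (200, 500)
}
```

Checking the returned text with `within_limit` is a cheap safeguard, since a model asked for "at most 200 words" will not always comply.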

Considerations

1 Design

1.1 AI Ethics and Policy Challenges

Should we inform students that submitting the form involves an AI output? Hwa Chong’s answer is ‘yes’. The String team is especially heartened that school leadership is leaning towards this direction for transparency with students

Figure 6: What students see at the beginning so they can decide whether to participate in this AI workflow

If the student doesn't consent, they can opt out and need not spend time filling in the form.

The String team earlier experimented with sending a copy of the output to the student, to eliminate any doubt that their input was fed through a Large Language Model, with the same assurance and caveat that another layer of teacher review will be done.

1.2 Hallucination Mitigation

Any generative AI has the potential to go off track, even with temperature* set to 0.

*Temperature is a parameter that regulates randomness. A higher temperature value typically makes the output more diverse and creative but might also increase its likelihood of straying from the context.
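To make the footnote concrete, the sketch below shows the standard way temperature is commonly described: the model's raw scores are divided by the temperature before being turned into probabilities. The scores here are made up for illustration, and this is a simplified picture rather than the internals of any particular service.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        # Temperature 0 is usually treated as "always pick the top token",
        # so this sketch only handles positive values.
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate next words
low = softmax_with_temperature(logits, 0.2)   # top candidate dominates
high = softmax_with_temperature(logits, 2.0)  # probabilities more even
```

Even when the top candidate is always chosen, that candidate can still be factually wrong, which is why a low temperature reduces variety but does not remove the risk of hallucination.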

The following are ways to manage the risk of hallucination, which the team acknowledges will inevitably exist for the near future:

Hallucination mitigation measure #1: reminder not to copy wholesale

This is included as the header for all outputs, and teachers are also briefed on how it is their responsibility to ensure that the final referral submitted has been checked for accuracy.

Hallucination mitigation measure #2: inclusion of original input

Most schools also have an established practice of checking any student self-declared information against formal school records so this layer of check also exists.

The final layer of checking is done by schools that have testimonial writing teams* to vet the content. This is typically done by language teachers, and we hope the tool at least reduces the pain of grammar checks, spell checks and other repetitive tasks that AI can alleviate.

*not all schools have this and some discretion is advised by the teacher processing output from Reference Co-Pilot
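Mitigation measures #1 and #2 amount to a fixed layout for every output: a reminder on top, the AI draft in the middle, and the student's original input at the bottom for cross-checking. A minimal sketch of that layout is shown here. The reminder wording and the function name `assemble_output` are assumptions, since the article does not quote the actual header text.

```python
# Hypothetical wording; the real header text is not quoted in the article.
REMINDER = (
    "AI-GENERATED DRAFT. Do not copy wholesale. "
    "Check every claim against the student's input and school records."
)


def assemble_output(student_name: str, ai_draft: str, original_input: str) -> str:
    """Combine reminder, AI draft and unedited student input into one document."""
    sections = [
        REMINDER,
        f"Draft reference for {student_name}:\n{ai_draft.strip()}",
        # Keeping the raw input next to the draft lets the teacher spot
        # anything the model added that the student never wrote.
        f"Original student input (unedited):\n{original_input.strip()}",
    ]
    return "\n\n".join(sections)


# Usage with placeholder text.
document = assemble_output("Student A", "<AI draft>", "<student's own words>")
```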

We sincerely believe that the caveats/ hallucination mitigation measures are more meaningful when we co-create rules outside of the app to keep a human in the loop, rather than relying on the user alone to pick up on User Interface (UI) cues.

2 Cost

There are a few 3rd party services listed that are not free - the team doesn’t endorse a particular tool

The table below indicates the pricing for OpenAI/ ChatGPT:

Figure 7: many LLM cost calculators available in the market for cost estimates

The workflow automation/ integration tool of choice may incur an additional cost as well.

We will also update this section once we have actual cost data. If your school is interested in deploying its own custom LLM solution with pay-per-use pricing, it may be helpful to gauge the volume using such calculators, and to set a billing cap to avoid bill shock.
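For a rough sense of how such calculators work, the estimate is simply tokens multiplied by a per-token price. The sketch below uses placeholder numbers throughout: the token counts, the prices and the billing cap are assumptions for illustration, not actual figures from this project or current provider pricing.

```python
def estimate_cost(
    num_students: int,
    input_tokens_per_student: int,
    output_tokens_per_student: int,
    price_in_per_1k: float,
    price_out_per_1k: float,
) -> float:
    """Estimate total spend for one batch of references, in the price's currency."""
    per_student = (
        (input_tokens_per_student / 1000) * price_in_per_1k
        + (output_tokens_per_student / 1000) * price_out_per_1k
    )
    return num_students * per_student


# All values below are placeholders; check your provider's current pricing.
BILLING_CAP = 5.00
cost = estimate_cost(
    num_students=40,
    input_tokens_per_student=1500,
    output_tokens_per_student=800,
    price_in_per_1k=0.0015,
    price_out_per_1k=0.002,
)
exceeds_cap = cost > BILLING_CAP
```

Running the estimate with a class size at the upper end of the 30-40 range gives a conservative figure to compare against the billing cap before any real usage starts.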

Conclusion

Workflow automation and integration hold immense potential to impact teaching and learning. String is excited that the Hwa Chong EdTech is not only actively maintaining the product but also taking the lead in shaping discussions of AI ethics.

We hope this article has provided some inspiration on how you can get started with leveraging emerging technologies to reduce teacher workload.

Interested in building this workflow or bringing this to your school?

Reach out to us on WhatsApp

Or join us at the upcoming SgLDC Virtual Meet on 4-5th September!

Note: The String team is rewriting a tutorial to achieve the same flow using FormSG and Plumber (created by Open Government Products). Stay tuned by subscribing (: