Start by establishing a handful of test cases - core use cases and failure cases that you want to ensure your prompt can handle. As you explore modifications to the prompt, use promptfoo eval to rate ...
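The workflow described above (fix a small set of core and failure test cases, then score each prompt revision against them) can be sketched in plain Python. This is only an illustration of the idea, not promptfoo's API or configuration format: `TestCase`, `pass_rate`, `fake_model`, and both prompt templates are hypothetical names, and `fake_model` is a deterministic stand-in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TestCase:
    """One scenario the prompt must handle (hypothetical structure)."""
    name: str
    variables: dict[str, str]
    check: Callable[[str], bool]  # returns True if the output is acceptable


def fake_model(prompt: str) -> str:
    """Deterministic stand-in for an LLM call (assumption, not a real API)."""
    text = prompt.split("TEXT:", 1)[1].strip()
    # Only handles empty input if the prompt explicitly asks for it.
    if not text and "if the text is empty" in prompt.lower():
        return "No input provided."
    return "Summary: " + text


def pass_rate(template: str, cases: list[TestCase],
              model: Callable[[str], str]) -> float:
    """Fraction of test cases whose output passes its check."""
    passed = sum(
        1 for case in cases
        if case.check(model(template.format(**case.variables)))
    )
    return passed / len(cases)


# One core use case and one failure case.
CASES = [
    TestCase("core: short text", {"text": "Cats sleep a lot."},
             lambda out: out.startswith("Summary:")),
    TestCase("failure: empty input", {"text": ""},
             lambda out: "no input" in out.lower()),
]

PROMPT_V1 = "Summarize the text.\nTEXT: {text}"
PROMPT_V2 = ("Summarize the text. If the text is empty, "
             "reply 'No input provided.'\nTEXT: {text}")

if __name__ == "__main__":
    for label, template in [("v1", PROMPT_V1), ("v2", PROMPT_V2)]:
        print(label, pass_rate(template, CASES, fake_model))
```

Keeping the test cases fixed while only the prompt changes is what makes the scores comparable between revisions.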
Well-documented code with descriptive comments. Complexity annotations and usage examples included. Input validation for robust solutions. Version control history with dated commits for each change.