I just spent this morning catching up on the work of @simon on LLM tooling. I installed and ran about 10 different AI models locally.
It's all really easy and I'm loving how open and experimental the community is. Folks are doing thousands of hours of work developing and training models and I can have them running on my home computer in just a few minutes.
Lots of details: https://nelsonslog.wordpress.com/2023/08/16/running-my-own-llm/
@nelson @simon I asked it how to get from Gate B5 to Gate C10 at SFO, and it told me to take an Uber – https://millsfield.sfomuseum.org/wayfinding/#from=1763588483&to=1763588253
@thisisaaronland @nelson frustratingly, that one falls into the category of "prompts that I'm pretty sure wouldn't work, but I can't actually articulate why I know that"
@simon @thisisaaronland I definitely don't look to LLMs for factual information. What's interesting about "what is the capital of France" isn't that it knows the answer is Paris, it's the ability to ask and receive answers in natural language, to expand on them, to manipulate them as language.
@nelson @simon “Uncensored” means the person fine-tuning the model removed data that injected “alignment”, i.e. whether an LLM’s outputs are aligned with societal values.
We frown on murder, hurting kids, bombs, spamming, etc. so ‘instruction tuning’ datasets add ‘refusal’ examples for those topics and others.
Folks making ‘uncensored’ models use the same datasets, removing refusals, either because they disagree with the alignment or because the model ends up refusing tangentially related prompts.
@cypherfox @nelson I had a look at the data for one of those "uncensored" models and the approach seemed almost embarrassingly naive to me - they pretty much filtered out any text that included a denial to do something or "as a large language model ..."
But... those models do appear to perform well! Too much tuning on how to reject requests does look like it might have a negative effect on overall performance.
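For illustration, that kind of naive filtering amounts to something like the sketch below. This is a guess at the shape of it, not what any particular project actually ran: the phrase list and the dataset layout (a list of instruction/response dicts) are placeholders.

```python
# Rough sketch of naive phrase-based refusal filtering, assuming an
# instruction-tuning dataset shaped as {"instruction": ..., "response": ...} dicts.
# The phrase list is illustrative, not the actual one those projects used.
REFUSAL_PHRASES = [
    "as a large language model",
    "as an ai language model",
    "i cannot assist with",
    "i'm sorry, but",
]

def looks_like_refusal(text: str) -> bool:
    # Case-insensitive substring check against the canned refusal phrases.
    lowered = text.lower()
    return any(phrase in lowered for phrase in REFUSAL_PHRASES)

def strip_refusals(dataset: list[dict]) -> list[dict]:
    # Drop every pair whose response matches one of the phrases,
    # essentially the "fgrep -v" approach mentioned later in the thread.
    return [row for row in dataset if not looks_like_refusal(row["response"])]
```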
@simon @nelson Embarrassingly naive is one way to put it.
I…disagree pretty strongly with some of the redactions, e.g. one set of uncensored models removes all references to ‘transgender’ regardless of whether it’s in a refusal or not, which also removes some completely alignment-unrelated instruction/response pairs.
Yes, removing refusals definitely seems to make the models work better for all kinds of questions & I use them myself almost exclusively, but it’s important to know the biases involved.
@simon @nelson Probably the best way to build a dataset like that would be to take every response, throw it at an Oracle (i.e. another LLM, even a much simpler one) and ask, ‘Is this a refusal?’ and token-limit the output to be ‘Yes’ or ‘No’ using a logits processor.
That would be a great way to filter a dataset, but it would take a lot longer than basically ‘fgrep -v’.
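Sketched out, that oracle approach might look something like this with Hugging Face transformers. The model name, prompt wording, and dataset shape are placeholders, not anything from the thread; the point is just constraining the classifier to a single Yes/No token with a logits processor.

```python
# Sketch of the "oracle" refusal filter: ask a small local LLM whether each
# response is a refusal, with generation constrained to "Yes" or "No".
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

MODEL = "a-small-instruct-model"  # placeholder: any small local instruct model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

YES_ID = tokenizer.encode("Yes", add_special_tokens=False)[0]
NO_ID = tokenizer.encode("No", add_special_tokens=False)[0]

class YesNoOnly(LogitsProcessor):
    """Mask every token except 'Yes' and 'No', so the oracle can only answer those."""
    def __call__(self, input_ids, scores):
        mask = torch.full_like(scores, float("-inf"))
        mask[:, [YES_ID, NO_ID]] = 0.0
        return scores + mask

def is_refusal(response: str) -> bool:
    prompt = ("Does the following reply refuse to carry out the request? "
              "Answer Yes or No.\n\n" + response + "\n\nAnswer:")
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(
        **inputs,
        max_new_tokens=1,   # one token: "Yes" or "No"
        do_sample=False,    # greedy, we just want the more likely of the two
        logits_processor=LogitsProcessorList([YesNoOnly()]),
    )
    return tokenizer.decode(out[0, -1:]).strip() == "Yes"

def strip_refusals(dataset: list[dict]) -> list[dict]:
    # Keep only instruction/response pairs the oracle does not flag as refusals.
    return [row for row in dataset if not is_refusal(row["response"])]
```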