A weekly summary of news and notes for MLC@Home
This update is a little late after the long weekend in the US, and there has been a lot of work going on behind the scenes. It was a relatively quiet week in the forums.
The biggest news is that there's been a hitch with Dataset 3 WUs: we're having trouble generating data that the networks can learn. As a refresher, Datasets 1 and 2 are what are being computed now, and they are nearing completion. Dataset 3 is supposed to train similar RNN networks, but go "wide" instead of "deep" (100 different training sets with only 100-ish examples of each, vs. 5 training sets with 10000 examples of each). So instead of mimicking the 5 simple machines that Datasets 1 and 2 are computing, we would use 100 randomly generated deterministic finite automata and train networks to mimic the behavior of these automata. Surprisingly, we're having trouble learning these automata with the networks we have. We suspect a bug in our data generation code, which we're still tracking down and which is taking a lot longer than planned.
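To make the Dataset 3 idea above more concrete, here is a minimal Python sketch of generating a random deterministic finite automaton and labelling random input sequences with its behavior. This is not the project's actual data generation code. The binary alphabet, the per-step "is the current state accepting" output, the state count, the sequence length, and the names `random_dfa`, `run_dfa`, and `make_examples` are all assumptions made for illustration.

```python
import random


def random_dfa(num_states, alphabet, rng):
    """Build a random DFA: a complete transition table and a random accepting set."""
    transitions = {
        (state, symbol): rng.randrange(num_states)
        for state in range(num_states)
        for symbol in alphabet
    }
    # Assumption: each state is accepting with probability 0.5.
    accepting = {s for s in range(num_states) if rng.random() < 0.5}
    return transitions, accepting


def run_dfa(transitions, accepting, inputs, start=0):
    """Return one 0/1 label per input symbol: 1 if the state reached is accepting."""
    state = start
    outputs = []
    for symbol in inputs:
        state = transitions[(state, symbol)]
        outputs.append(1 if state in accepting else 0)
    return outputs


def make_examples(transitions, accepting, alphabet, count, length, rng):
    """Generate `count` random input sequences paired with the DFA's outputs."""
    examples = []
    for _ in range(count):
        inputs = [rng.choice(alphabet) for _ in range(length)]
        examples.append((inputs, run_dfa(transitions, accepting, inputs)))
    return examples


rng = random.Random(0)  # fixed seed so the sketch is reproducible
alphabet = [0, 1]       # hypothetical binary input alphabet
transitions, accepting = random_dfa(4, alphabet, rng)
examples = make_examples(transitions, accepting, alphabet, count=100, length=16, rng=rng)
```

Because the automaton is deterministic, re-running `run_dfa` on any stored input sequence should reproduce its stored labels exactly, which makes for a cheap consistency check on generated data.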
Because of this, we're pushing up work on Dataset 4. Dataset 4 will be the first to train convolutional neural networks (CNNs) on variants of MNIST, specifically those used by the TrojAI project and the "BadNets" paper. The hope is that with enough examples of each network, we can show that the same weight-space separation we're able to show with Datasets 1 and 2 on simple RNNs is *also* present in CNNs, which would demonstrate that weight-space analysis applies more broadly to identifying training data. An updated client with Dataset 4 support is already underway, and shouldn't take too long. Hopefully it will be ready this week, but given the unforeseen issues with Dataset 3, we're hesitant to state a deadline.
Meanwhile, work on debugging Dataset 3 continues, as does paper writing for a conference deadline at the end of the month.
News:
- Dataset 3 debugging continues.
- Client changes for Dataset 4 are underway.
- Client application issues have settled down after a few weeks of turmoil.
- We'll do an official release of a preliminary dataset (1+2) once we have at least 1000 examples of each machine type, and we're getting closer!
- We can now confirm the new server is ordered and in process.
- We haven't forgotten about badges! We're just focused on the paper and new WU generation at the moment. That said, if volunteers would like to offer potential designs for badges, head on over to the forums and join the discussion.
Project status snapshot:
Tasks
- Tasks ready to send: 19271
- Tasks in progress: 19684
Users
- With credit: 661
- Registered in past 24 hours: 65
Hosts
- With recent credit: 1874
- Registered in past 24 hours: 33
Current GigaFLOPS: 27494.27
Dataset 1 and 2 progress:
- SingleDirectMachine: 10002/10004
- EightBitMachine: 9962/10006
- SingleInvertMachine: 10001/10003
- SimpleXORMachine: 10000/10002
- ParityMachine: 537/10005
- ParityModified: 90/10005
- EightBitModified: 3729/10006
- SimpleXORModified: 10005/10005
- SingleDirectModified: 10004/10004
- SingleInvertModified: 10002/10002
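As a rough illustration of how these progress figures relate to the 1000-examples-per-machine release threshold mentioned in the news items, the short Python sketch below lists each machine's completion and flags the ones still under the threshold. The numbers are copied from the snapshot above. Treating the second number in each pair as a total, and the names `progress` and `RELEASE_THRESHOLD`, are assumptions for illustration.

```python
# (completed, listed total) per machine type, copied from the progress snapshot.
progress = {
    "SingleDirectMachine": (10002, 10004),
    "EightBitMachine": (9962, 10006),
    "SingleInvertMachine": (10001, 10003),
    "SimpleXORMachine": (10000, 10002),
    "ParityMachine": (537, 10005),
    "ParityModified": (90, 10005),
    "EightBitModified": (3729, 10006),
    "SimpleXORModified": (10005, 10005),
    "SingleDirectModified": (10004, 10004),
    "SingleInvertModified": (10002, 10002),
}

RELEASE_THRESHOLD = 1000  # examples per machine type needed for the preliminary release

# Machine types that have not yet reached the release threshold.
below = {name: done for name, (done, _total) in progress.items() if done < RELEASE_THRESHOLD}

for name, (done, total) in progress.items():
    print(f"{name:<22} {done:>5}/{total} ({100 * done / total:.1f}%)")

print("Still below the release threshold:", sorted(below))
```

With this week's numbers, only ParityMachine and ParityModified are still short of 1000 examples.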
Last week's TWIM Notes: Aug 31 2020
Thanks again to all our volunteers!
-- The MLC@Home Admins
Source: https://www.mlcathome.org/mlcathome/for ... .php?id=72