WEBVTT 1 00:00:03.000 --> 00:00:09.420 Josh Moore: I think we're good. Okay, I'll share my screen for just a second. I want to make sure everyone knows how we're using the HackMD document. 2 00:00:10.469 --> 00:00:11.340 Josh Moore: Um, 3 00:00:16.619 --> 00:00:31.860 Josh Moore: So the link's in the chat. Um, at the bottom of the file you can see all the notes from the previous conversation. So the European and Asian conversation, plus a couple of crazy East Coasters, um, they get lots of credit. 4 00:00:33.450 --> 00:00:46.800 Josh Moore: So we'll be filling out this session. I've left the topics that were discussed in the morning. So if anyone wants to recap, we can go, we can talk about those. Put your name beside them and then we'll start to order the topics by interest. So the more 5 00:00:51.360 --> 00:00:54.060 Josh Moore: The more names that are beside something, then 6 00:00:55.320 --> 00:01:03.720 Josh Moore: The more it'll bubble to the top. If you have questions, if you want to talk about anything you want to share, by all means, um, 7 00:01:05.340 --> 00:01:10.050 Josh Moore: Alright, so that's probably enough from me before we dive into things. So 8 00:01:11.670 --> 00:01:25.650 Josh Moore: Who showed up? Alright, so to go around the room, by my participants list, we'll do the non-Dundonians first, alphabetically. You can start to prepare yourself. 9 00:01:27.870 --> 00:01:43.410 Josh Moore: Is Mark at the top of the list? I assume that'll change. So I think, Bill, you're at the top. You want to take about a minute to introduce yourself? Someone, yeah, Seb has a clock. He will let you know when you're running over, but this went pretty quickly last time, so go for 10 00:01:45.420 --> 00:02:00.840 Bill Katz: Me? Yes. Yeah. So, I'm Bill Katz. I'm with Janelia, the FlyEM team. I designed DVID and I'm working pretty extensively on data stores matches to image. 11 00:02:01.440 --> 00:02:19.680 Bill Katz: Formats or segmentation.
But basically, a little bit more general in terms of data types. But the key thing I'm also very interested in is branched versioning, in creating a GitHub-like way of starting to do data sharing among, you know, various image formats, etc. 12 00:02:25.440 --> 00:02:25.860 Josh Moore: Caterina. 13 00:02:27.180 --> 00:02:28.680 Bill Katz: That's it. Thanks. 14 00:02:31.170 --> 00:02:35.460 Caterina Strambio: Hello, Caterina Strambio De Castillia. I am here 15 00:02:36.510 --> 00:02:44.100 Caterina Strambio: Primarily to make sure that we are, I'm working primarily on the metadata side, microscopy metadata, 16 00:02:45.360 --> 00:02:47.400 Caterina Strambio: To extend the OME model. 17 00:02:48.420 --> 00:03:08.340 Caterina Strambio: And I'm here primarily to make sure that what we're doing, you know, matches what you guys, I mean, what the data container will have, because obviously the two things are very connected. There is going to be a different community call about the metadata aspects. 18 00:03:10.110 --> 00:03:10.440 Caterina Strambio: Thank you. 19 00:03:12.330 --> 00:03:12.600 Josh Moore: Damir. 20 00:03:16.830 --> 00:03:16.980 Damir Sudar: I'm 21 00:03:18.360 --> 00:03:25.380 Damir Sudar: Working for a small company called Quantitative Imaging Systems and closely associated with Oregon Health and Science University in Portland. 22 00:03:26.100 --> 00:03:44.850 Damir Sudar: My interest is primarily, it's very similar to Caterina's, because we're on the working group trying to come up with that metadata standards called set. And I'm also very interested in the data container side because a lot of our data is in 23 00:03:45.960 --> 00:03:57.210 Damir Sudar: Formats right now that are difficult to handle. And so a new way of storing pixel data is desperately needed. I'd love to be part of that. 24 00:03:58.950 --> 00:03:59.310 Thanks.
25 00:04:02.640 --> 00:04:03.030 Josh Moore: David 26 00:04:06.000 --> 00:04:14.670 David Pinto: Hi, I'm David from micro University of Oxford. I've been here this morning. A similar to discussions going to be a good evening and 27 00:04:15.180 --> 00:04:24.060 David Pinto: I've mostly worker at the moment in acquisition software for microscopes. So I'm interested. How can you know use these not only to save data. 28 00:04:24.420 --> 00:04:35.430 David Pinto: In, you know, recognize you for much, and as well, like the meta data since our focuses is, you know, quite exotic microscopes and these devices. So unless you can make use of these tools. 29 00:04:40.290 --> 00:04:40.740 Josh Moore: Davis. 30 00:04:41.370 --> 00:04:46.650 Davis Bennett: Hey, spin it, I'm engineering a research campus and I work with a project team that is 31 00:04:48.720 --> 00:04:57.300 Davis Bennett: Really big fix them data sets and generating tons of extra derived volumes from those and then moving them around three and making them all shareable 32 00:04:58.350 --> 00:05:01.770 Davis Bennett: So I have strong opinions about multi scale metadata. 33 00:05:02.910 --> 00:05:12.270 Davis Bennett: And, you know, all kinds of image container issues, kind of like sheet microscopy. So my data has no time access now, but I'm axis. 34 00:05:16.950 --> 00:05:17.310 Josh Moore: Here in 35 00:05:18.660 --> 00:05:31.170 Eric Perlman: America work. Previously on in a softball with TM and various large volumes of data. So I care about multi scale for people to access it, and most recently have been working with 36 00:05:32.730 --> 00:05:35.820 Eric Perlman: Jackson Lab on us arranging slides. 37 00:05:37.830 --> 00:05:42.870 Eric Perlman: And yes, I am here for the second session but I figured out, mostly sit silent unfortunately couldn't hide from Josh 38 00:05:47.730 --> 00:05:49.410 Josh Moore: More Eric's yeah 39 00:05:49.560 --> 00:05:54.870 Eric Wait: My name is Eric. 
Wait, I'm at the advanced imaging center edge anemia and my interest is getting 40 00:05:56.160 --> 00:06:09.390 Eric Wait: Our visitors come in and use our scopes and it's kind of a one off for them. So how can we make data that they can use them at their Institute and maybe make something more cohesive for for people to use 41 00:06:10.440 --> 00:06:14.520 Eric Wait: With us with the event scopes and maybe with their scopes back in their, their home Institute. 42 00:06:17.700 --> 00:06:18.360 Josh Moore: Eric 43 00:06:20.010 --> 00:06:20.820 Erick Ratamero: That's me, I guess. 44 00:06:22.530 --> 00:06:32.250 Erick Ratamero: Yeah, my name is Eric Romero. I work at the Jackson Laboratory, together with the other Eric THAT JUST TALK LIKE TO TIME TO TWO PEOPLE GO. 45 00:06:32.700 --> 00:06:43.260 Erick Ratamero: I've been doing this by image analysis thing for the last three years or so, first in a facility contacts. Now in it kind of contexts. 46 00:06:43.770 --> 00:06:49.950 Erick Ratamero: And jack seems to be like a place with a very specific intersection of people with the capability. 47 00:06:50.700 --> 00:07:08.250 Erick Ratamero: The expertise and the interest and the bandwidth to actually work on questions like, what do we do in terms of file formats for the next generation. So we are interested and trying to take part on this this journey to what comes next. 48 00:07:11.160 --> 00:07:19.140 Erin Diel: Hi everyone, I'm Aaron deal. I'm an application specialist with Glencoe software. So just really interested to hear how everyone's been using the tools and learn about 49 00:07:19.380 --> 00:07:20.250 Erin Diel: What you want from them. 
50 00:07:22.950 --> 00:07:26.400 Josh Moore: Never ready for how fast Erin is, so Ilan. 51 00:07:28.650 --> 00:07:39.570 Ilan Gold: Hi, my name is Ilan. I work with Trevor, who I think will do some stuff later based on the order we're going in the work on this bizarre and basically bringing 52 00:07:40.710 --> 00:07:42.180 Ilan Gold: Your data to the web. 53 00:07:43.890 --> 00:07:58.890 Ilan Gold: And sort of its primary forms will use web beyond sort of similar theme to napari to give me like fast editing. So I guess my interest right now is mainly in making sure that, you know, things are ready for viewing in the browser. 54 00:08:02.640 --> 00:08:03.090 Jamie 55 00:08:04.980 --> 00:08:12.420 Jamie Sherman: I work at the Allen Institute for Cell Science in kind of an engineer-scientist role, and basically my focus is on 56 00:08:13.230 --> 00:08:23.610 Jamie Sherman: Trying to eliminate barriers for us being able to share our data in more open formats and, like, just getting it out of the proprietary formats that we're dealing with. And 57 00:08:24.390 --> 00:08:41.280 Jamie Sherman: Ideally, like, we've been using Quilt predominantly for sharing our data, but my major interest in the next-gen formats is that it might actually allow us to put our data in the cloud and not have people have to download things, because that just really doesn't work so well. 58 00:08:43.200 --> 00:08:44.130 Jamie Sherman: Anyway, thanks. 59 00:08:48.810 --> 00:08:49.260 John 60 00:08:51.750 --> 00:09:04.170 John Bogovic: My name is John Bogovic. I'm part of the Saalfeld lab at Janelia. I'm a contributor to ImageJ and Fiji, and I've been one of the bigger contributors to N5, which is one of the newer block-based file formats.
61 00:09:05.970 --> 00:09:18.180 John Bogovic: In this context, I'm mostly interested in, I think, as many tools as possible, be able to open the file formats or file formats that we come across come upon 62 00:09:19.170 --> 00:09:26.670 John Bogovic: Since it's frustrating to see people on the user end having to consistently and 63 00:09:27.180 --> 00:09:36.480 John Bogovic: Almost always receive data from one format to another in order to push it into different tools. So while I don't expect there will be one file format to rule them all. 64 00:09:36.990 --> 00:09:49.560 John Bogovic: Whatever we come up with, I would hope it would be easy for developers to write inputs and outputs right readers and writers for that's one of my invested vested interests. So that's all 65 00:09:49.620 --> 00:09:49.890 Thanks. 66 00:09:51.570 --> 00:09:51.870 Mark 67 00:09:55.470 --> 00:10:01.530 Mark Kittisopikul: Me I'm Marcus awful and currently as associate that a system engineer. 68 00:10:02.580 --> 00:10:10.230 Mark Kittisopikul: I'm mainly attached to the collab with moment. Um, so I kind of inherited the legacy of color. Walk format. 69 00:10:12.240 --> 00:10:16.500 Mark Kittisopikul: And currently fairly interested in how to receive 70 00:10:17.940 --> 00:10:20.220 Mark Kittisopikul: Store away data for legend across the 71 00:10:24.240 --> 00:10:24.660 Next, 72 00:10:25.800 --> 00:10:37.950 Nicholas Sofroniew: Next Friday. If I'm reading the emerging tech team, the chance, like a bag initiative, we're focused on trying to provide or improve access to reproducible quantitative by image analysis. I'm also on the steering Council. 
73 00:10:38.310 --> 00:10:50.280 Nicholas Sofroniew: And Nicole contribute to the party project a multi dimensional image view of a Python and that we have a plugin infrastructure to support reading and writing files and I'd want to make sure that 74 00:10:51.270 --> 00:11:06.240 Nicholas Sofroniew: Any file format that you know were able to easily read and write both pixels and metadata, I think, you know, going beyond that. I'm probably also interested in representations that are 75 00:11:06.960 --> 00:11:24.450 Nicholas Sofroniew: Conducive to saving process data to not just roll pixels as well so you know segmentation masks. Even things you know that end up being much smaller in size but still need to be standardized around metadata, you know, shapes polygons, etc, etc. So that's just me. 76 00:11:29.370 --> 00:11:42.720 Nicolas Chiaruttini: Hello everybody. So my name is Nicole me I'm a microscope East and I perform also image analysis for users. You know microscopy facility at EPA fell in Switzerland. 77 00:11:43.590 --> 00:11:55.410 Nicolas Chiaruttini: And as such, we are using a lot of Fiji and you met Jay and so I'm on the Java side and our main issue is to be able to give to users. 78 00:11:56.550 --> 00:12:09.180 Nicolas Chiaruttini: unified way to open the data because we have microscope four monitors Nikon HR everybody all the stuff, and we hope that one day they will come up with a fight for much, which is good. 79 00:12:10.230 --> 00:12:18.420 Nicolas Chiaruttini: Which has a lot of meta data and, in particular, I'm interested into position on meta data. So when you combine different schools. 80 00:12:18.870 --> 00:12:40.440 Nicolas Chiaruttini: A bit like in correlative em, but also in microscope, you know, you have very big over of us, and then your sport or detail in some parts why I'm here. I'm really interested into combining also also in three dimensional space with different resolution. So that's my main focus. Cheers. 
Susanna 81 00:12:41.250 --> 00:12:47.970 Susanne Kunis: That's me. Hello, I'm Susanna coolness from us network from the university and also my book and 82 00:12:49.230 --> 00:13:12.750 Susanne Kunis: My focus is more meta data and capturing of meta data from Microsoft data and linked the data, but we working also on how we can share data and the base presentation that we in different facilities and so I'm happy to observe this discussion, how we can make this change. Thank you. 83 00:13:16.590 --> 00:13:28.230 Trevor Manz (he/him/his): Everyone I'm Trevor has been mentioned I were mostly a consumer of a lot of the hard work from the only community on these next generation file formats and trying to make sure that 84 00:13:29.490 --> 00:13:33.780 Trevor Manz (he/him/his): These cloud friendly format we can access in the in the web browser as well so 85 00:13:34.650 --> 00:13:49.500 Trevor Manz (he/him/his): I've been working on this project, which is sort of the web based viewer for these types of pixel data and then also GA S which is a JavaScript implementation star, which sort of factories intelligent things and also next generation software. 86 00:13:53.910 --> 00:13:54.270 Josh Moore: There we go. 87 00:13:56.190 --> 00:13:57.660 Ulrike Boehm: Sorry. Hello everyone. 88 00:13:58.710 --> 00:14:10.290 Ulrike Boehm: Again, I'm unfortunately right now in the lab. No, not unfortunately actually quite good that I can be in a lab. But anyway, so my name is already kaboom. I'm working at the advanced imaging center at omnia currently I'm with 89 00:14:11.190 --> 00:14:17.820 Ulrike Boehm: The multi focal instrument, but anyway. So at the beginning of the year, I was actually participating 90 00:14:18.540 --> 00:14:24.870 Ulrike Boehm: In a conference called software for microscopy was, which was mainly focused on acquisition acquisition software. 91 00:14:25.290 --> 00:14:34.380 Ulrike Boehm: But one of the major parts was there. 
Also the discussion about good data formats and as I can still recall from the discussion. 92 00:14:34.980 --> 00:14:46.080 Ulrike Boehm: People were already kind of envisioning the next type of data format and people were arguing about which data format is a good one. But what I think currently what we might need are 93 00:14:46.530 --> 00:14:55.020 Ulrike Boehm: Really, how can I say standards, also for the people who develop upcoming data formats to make sure that they actually fit into what's currently 94 00:14:56.010 --> 00:15:12.900 Ulrike Boehm: The norm, because I think Jason can maybe say they're a little bit more about but for example buyer formats. I think they adjust the they are software currently such that lots of data formats can be used. But every time when a new data form, it comes to the 95 00:15:13.950 --> 00:15:31.320 Ulrike Boehm: Data from. It comes to the market. This has to be changed again. And I think this is a hassle for lots of people in a thing. If I think we, if we have a standard and place it might be also easier to to integrate all of these and to make the life for everyone kind of a bit 96 00:15:32.520 --> 00:15:39.270 Ulrike Boehm: Easy. I just only briefly have to look at my notes if I forgot something. I am mainly because also 97 00:15:39.930 --> 00:15:40.830 Josh Moore: Just have more time. 98 00:15:41.730 --> 00:15:55.830 Ulrike Boehm: Okay. One thing was also very important. I mean, currently I'm also part of the corporate initiative, which is currently focused mainly on capturing standards for microscopes mainly hardware standards to guarantee that 99 00:15:56.880 --> 00:16:04.980 Ulrike Boehm: Everything is nicely in line. But to my mind hardware standards shouldn't be the only stuff. That's right. 
I think it would be actually be great if even 100 00:16:05.400 --> 00:16:18.120 Ulrike Boehm: This community here when we talk about data formats can be also kind of linked to also the other initiatives which were already kind of currently going on because I think right now there's lots of momentum in the community. 101 00:16:18.840 --> 00:16:26.220 Ulrike Boehm: And we really should kind of try to really get as make sure that as many voices as possible are being heard. 102 00:16:27.300 --> 00:16:28.650 Ulrike Boehm: And yeah. 103 00:16:28.710 --> 00:16:36.180 Josh Moore: I think we all agree with you. So actually, we should probably just take that recording and, you know, send that around to people, that's exactly what we all want to say were completely behind you. 104 00:16:36.750 --> 00:16:48.450 Josh Moore: Um, and maybe we can get a topic listed. If you know if there's more we can do about making that happen. We'll get back to it but just we had a couple of people who showed up. So getting through. It was Blair showed up and then Allah 105 00:16:50.250 --> 00:16:58.860 Blair Rossetti (Janelia): Hello, my name is players, Eddie. I work with Baraka, and with their weight at the AC and happy to see many familiar faces and hear what's going on. 106 00:17:01.620 --> 00:17:04.170 Ola Tarkowska: I am a lot. I work in sanker 107 00:17:05.190 --> 00:17:09.810 So my interest is basically in microscopy platform and coming 108 00:17:12.900 --> 00:17:18.450 Josh Moore: Cool, so will rush through the Dundee team. Plus, Melissa, Jason, you want to kick it off. 109 00:17:19.710 --> 00:17:31.500 Jason Swedlow: Sure. Hi everybody I'm receive any of you. Again, my name is Jason. Hello, I'm at the University of Dundee work with the army team. My role there is more or less to keep the money rolling in, and 110 00:17:32.610 --> 00:17:35.640 Jason Swedlow: Try to Yeah keep everything on track. 
111 00:17:39.000 --> 00:17:39.570 Josh Moore: Do you go 112 00:17:42.150 --> 00:17:50.340 David Gault: Hi everybody. So I'm David Goltz have a developer on your meeting primarily working on bio formats and all native 113 00:17:54.390 --> 00:18:01.770 Petr Walczysko: Hi, my name is Peter I DO WITH YOU. ME TO mainly the outreach is and quality assurance. 114 00:18:03.930 --> 00:18:16.230 Sebastien Besson: And Sebastian, I'm working for with me to be mostly interested in the all everything from at related from by former to imitative and, more recently, heavily involved in the image data resource. 115 00:18:17.760 --> 00:18:31.200 Simon Li: I am certainly one of the mirror developers and also one of the main idea, our developers and sis admins. So I've seen it was so poor data that's coming into the idea that hopefully benefit from the performance discussion we're having now. 116 00:18:33.660 --> 00:18:40.170 Will Moore: I am will I'm mostly work on the web viewing side of a mirror. 117 00:18:42.270 --> 00:18:49.380 Will Moore: More recently just starting to look at how to show me the data and the clients in the web and in an apartment. 118 00:18:52.620 --> 00:19:00.480 Melissa Linkert: And I'm awesome. I worked at Bank of software and collaborate with Ami team and I work on pretty much anything you can think of that relates to file formats. 119 00:19:03.630 --> 00:19:18.000 Josh Moore: Cool, thank you everyone. So we're 28 people and we got done in 23 minutes. So on average, big golden stars. Um, so how do we want to proceed. So the 120 00:19:19.350 --> 00:19:23.280 Josh Moore: Working backwards. There's there. There are no new topics. 121 00:19:24.450 --> 00:19:34.110 Josh Moore: And there's a few people who have added interests. 
So this might be a good time for everyone to take a second and think about what they'd like to talk about, um, 122 00:19:35.250 --> 00:19:39.840 Josh Moore: I have my list of what everyone mentioned, so I'll probably bring those up, if no one else says anything 123 00:19:41.160 --> 00:19:50.100 Josh Moore: Um, and then I suggest we let everyone get a chance to ask questions about the couple of videos that were posted. So how many people actually watch the videos. 124 00:19:51.990 --> 00:19:58.320 Josh Moore: Oh, that's actually much better than the morning crew so you guys had more time. Awesome. So I'll assume there are a couple of questions on the videos. 125 00:20:00.450 --> 00:20:09.900 Josh Moore: And we'll get to that in just a second. But first, before anything else happens to anyone. Is anyone lost or has anything been mentioned that no one has an idea about what it is that we need to cover 126 00:20:10.770 --> 00:20:17.520 Josh Moore: Someone actually didn't need a favor this morning of asking what is in GFS and so we could actually start off with that. So it's, um, 127 00:20:21.930 --> 00:20:22.650 Josh Moore: Who unmuted. 128 00:20:25.170 --> 00:20:32.520 Josh Moore: Okay, so I'm going to assume everyone. Good. You know what's going on, you've watched the videos you're ready to dig into this and and hopefully make some pretty 129 00:20:33.840 --> 00:20:35.160 Josh Moore: Substantial decisions. 130 00:20:36.990 --> 00:20:43.080 Josh Moore: So they were cool keep out of your names. Okay, so there are the four videos so 131 00:20:44.850 --> 00:20:50.310 Josh Moore: We'll and Trevor did a fairly good job. I think of showing the state of the latest latest specs. 132 00:20:51.180 --> 00:21:00.060 Josh Moore: So, so, you know, there are three specs that we've worked on as kind of a community to have those have been posted to image SC and we're still looking for comments. 133 00:21:00.480 --> 00:21:09.660 Josh Moore: I'm in the morning session. 
We certainly talked about places where those need to be adjusted, and then the most recent spec that we've been working on is for high-content screening, and that's what Will showed. 134 00:21:11.640 --> 00:21:12.840 Josh Moore: Any immediate thoughts? 135 00:21:20.820 --> 00:21:31.350 Damir Sudar: Traditional one. One thing that struck me a little bit is that the three specs, of course, tie all together. But is there a kind of an overarching statement that says 136 00:21:32.550 --> 00:21:42.420 Damir Sudar: What subcomponents are needed, and what makes some of them more urgent or to go first? 137 00:21:42.510 --> 00:21:43.650 Damir Sudar: And we 138 00:21:44.520 --> 00:21:48.930 Damir Sudar: Know, kind of like a wish list and then an urgency or 139 00:21:51.000 --> 00:22:05.160 Damir Sudar: Needed-soon annotation to each of those things, so that link and also as a community can add things that we think are important and others can chime in on. Yeah, yeah, I need that too, or meh, I don't care. 140 00:22:06.240 --> 00:22:17.520 Josh Moore: Yeah, so it's something we've held off on to some extent, just defining a central repository for all the specifications, and at the moment you're more than welcome to open, so, or 141 00:22:18.450 --> 00:22:26.640 Josh Moore: The repositories linked in the document is currently putting, yes, open an issue against any of those saying I need whatever. 142 00:22:27.210 --> 00:22:31.680 Josh Moore: You know, that's where the conversation certainly could take place, or it could take place on image.sc. 143 00:22:32.280 --> 00:22:47.040 Josh Moore: Um, but the point is very taken. So certainly, the content from this morning. So there were, I think, three or four clear kind of requests for a recommendation, like specification of polygons, um, 144 00:22:48.090 --> 00:22:49.680 Josh Moore: I guess I can just go look what they were.
145 00:22:52.380 --> 00:23:00.900 Josh Moore: Gail and origin information on the multi scales that we we put on hold, from the multi scale specification, all of that. 146 00:23:01.860 --> 00:23:06.210 Josh Moore: Compression. All of that will need to show up as individual issues and so 147 00:23:06.840 --> 00:23:19.770 Josh Moore: I'm part of what we'll do, at the very end of this is just talk about how do we keep having these conversations, you know, is it all on GitHub. Is it all a repository. Are we doing it all on image SC, when are we having meetings, things like oh 148 00:23:22.470 --> 00:23:36.690 Josh Moore: As I said in my video certainly throughout the summer, we kind of heads down and just focused on getting a couple of things done. But now with this meeting. It's all about finding out ways to keep the conversation going and make sure that everyone can can get themselves heard 149 00:23:38.460 --> 00:23:53.010 Josh Moore: The flip side of that, though, is it will almost certainly be necessary if we have the community raising issues saying this is vital that we also have more people implementing those issues. So that'll be what I'm requesting right so just be ready for that. 150 00:23:54.900 --> 00:23:55.470 Josh Moore: Anyone else 151 00:24:04.620 --> 00:24:09.990 Josh Moore: Okay so everyone's good on the state of the videos. Everyone give it a try and download anything 152 00:24:13.110 --> 00:24:19.350 Josh Moore: Yeah, we thought about doing the whole live demo thing that's always it's always a question. Um, 153 00:24:21.240 --> 00:24:22.950 Josh Moore: Okay, so in addition to the things 154 00:24:24.330 --> 00:24:34.380 Josh Moore: I can't tell if everyone's happy or you're just quiet, um, for me it sounds like the two biggest things that we have to probably address. And it was quite different from this morning. 155 00:24:37.140 --> 00:24:46.080 Josh Moore: Our basically comes of the software from microscopy paper. 
So it's actually something I've been wanting to engage with for a while, so I don't know who all was there. 156 00:24:46.620 --> 00:24:53.790 Josh Moore: Um, but I can certainly express, I have a very different reading of how this is going to work. And I think the Janelia crew 157 00:24:55.350 --> 00:25:11.550 Josh Moore: Are certainly talking to private ish after the meeting chef on private, um, there's this idea that we can do everything with an API. And I think that's one of the big decisions we have to make: is it an API or is it a file format? Um, 158 00:25:12.870 --> 00:25:16.050 Josh Moore: So that's certainly something we can dig into if anyone's interested. 159 00:25:18.720 --> 00:25:25.350 Josh Moore: The processed data. So we can talk about the state of the label images as they stand. One of the requests from this morning 160 00:25:25.800 --> 00:25:35.220 Josh Moore: Was how to go about adding metadata onto the current specifications. You know, how do you add the class name for each of the labels in an image? 161 00:25:35.700 --> 00:25:54.360 Josh Moore: And that kind of went even further in terms of, can we add, you know, if you have vertices that are stored in the format, can you add information to each of the vertices? Basically, at the highest level, the most generic request is, can we have tabular data in the format? Um, 162 00:25:55.890 --> 00:26:08.040 Josh Moore: And eventually we will be able to. There's actually a problem supporting it in the same way between Zarr and N5. So that's something we can always use help with, making sure that we do it in the same way. 163 00:26:12.030 --> 00:26:16.950 Josh Moore: And then there's the larger metadata questions, which I guess is listed here. Okay. 164 00:26:18.870 --> 00:26:20.070 Josh Moore: I mean, it looks like at the moment 165 00:26:22.200 --> 00:26:24.810 Josh Moore: It's either multiscale or copying data.
166 00:26:27.870 --> 00:26:31.290 Josh Moore: I think copying data can kind of 167 00:26:32.520 --> 00:26:34.050 Josh Moore: Way, first we want to start there. 168 00:26:36.000 --> 00:26:37.020 Josh Moore: Mark, you want to say something. 169 00:26:37.860 --> 00:26:42.870 Mark Kittisopikul: Sure. So I think the broader question. Well, the maybe the 170 00:26:44.400 --> 00:26:55.230 Mark Kittisopikul: impetus for for that copying data is I think a lot of us are acquiring raw data from microscopes, or various devices and we have to do with somewhere. 171 00:26:55.770 --> 00:27:10.050 Mark Kittisopikul: And then we have to start analyzing or processing it. And so this is starting to become a really big problem is one usually coming up from microscope, you got it in one format either defined by the control software or by the camera manufacturer 172 00:27:11.730 --> 00:27:30.900 Mark Kittisopikul: Or, you know, however, someone brings it together. And then there's a need to either reformat it into something like we're talking about now or or maybe move it somewhere where it's more accessible. And so this is starting to become increasingly type consuming process that both inhibits 173 00:27:31.050 --> 00:27:32.700 Mark Kittisopikul: Us of the instrument, but 174 00:27:33.750 --> 00:27:41.580 Mark Kittisopikul: Also, it just makes it a challenge in terms of being able to get it to the people who need it. 175 00:27:45.510 --> 00:27:55.350 Mark Kittisopikul: And so I think maybe one one caution about kind of coming up with a new format is, you know, now we're adding get one more format. 176 00:27:56.460 --> 00:28:00.060 Mark Kittisopikul: To the whole list of things and making sure that we're not kind of 177 00:28:00.150 --> 00:28:00.690 Josh Moore: Making it 178 00:28:00.720 --> 00:28:03.600 Mark Kittisopikul: more onerous by requiring yet another copy of that do 179 00:28:07.320 --> 00:28:08.310 Josh Moore: You want to add on to that. 
180 00:28:09.480 --> 00:28:18.690 Davis Bennett: I mean, the issue that's particular to the chunked formats is that if you have any overhead associated with moving one thing on the file system to another, as 181 00:28:19.410 --> 00:28:26.580 Davis Bennett: Soon as you proliferate the number of things on the file system through chunking, that starts to be an issue that wasn't an issue for TIFF or HDF5. 182 00:28:29.070 --> 00:28:29.400 Davis Bennett: Oh, 183 00:28:30.420 --> 00:28:43.620 Davis Bennett: My experience has been, I mean, I'm using one like Zarr. By default, it puts all the chunks in a single folder. If you try to open that folder and it has a million chunks, you can go get a coffee or something. Is this good 184 00:28:45.540 --> 00:28:48.000 Davis Bennett: Oh, Zarr also supports the nested directory store. 185 00:28:50.460 --> 00:29:02.220 Davis Bennett: And if you want to copy a bunch of things, then parallelism is your friend. Both are not necessarily obvious to people who are used to a format that doesn't fill the file system with little things. 186 00:29:03.510 --> 00:29:11.250 Davis Bennett: And then another perspective is, if you want to sacrifice, this could be an extension to N5 or Zarr, if you're willing to sacrifice atomic writes, 187 00:29:11.790 --> 00:29:20.790 Davis Bennett: Then you could fuse chunks together and make a version of a dataset. And this is essentially what Neuroglancer, the web-based volume viewer, does. 188 00:29:21.630 --> 00:29:32.070 Davis Bennett: In their format, data are stored in chunks, but then there's a compaction step: the chunks are there and you have 10 files, each of which is pretty big, but it's easy to copy. 189 00:29:33.660 --> 00:29:35.490 Davis Bennett: All of these, sorry. 190 00:29:36.630 --> 00:29:46.080 Eric Perlman: As a, yeah, Trevor has done some very cool work on this, playing with, oh, I mean, typically find more optimal formats to use, but it's been cool to follow. I think you've been on those sites as well.
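NOTE Editorial sketch, not spoken in the call: the remark above that "parallelism is your friend" when copying a chunked container can be illustrated with a thread pool. This is a minimal sketch using only the Python standard library; the function name `copy_chunk_tree` and the `max_workers` default are assumptions made for illustration, and it is not the code Davis describes writing.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_chunk_tree(src: Path, dst: Path, max_workers: int = 8) -> int:
    """Copy every file under src to the same relative path under dst.

    Files are copied concurrently; returns the number of files copied.
    Empty directories are not recreated.
    """
    src, dst = Path(src), Path(dst)
    # pathlib's rglob("*") also matches dotfiles such as Zarr's .zarray.
    files = [p for p in src.rglob("*") if p.is_file()]

    # Create destination directories up front so workers never race on mkdir.
    for parent in {dst / p.relative_to(src).parent for p in files}:
        parent.mkdir(parents=True, exist_ok=True)

    def copy_one(path: Path) -> None:
        shutil.copy2(path, dst / path.relative_to(src))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator so any worker exception is re-raised here.
        list(pool.map(copy_one, files))
    return len(files)
```

Threads suit this case because copying many small files is dominated by file-system latency rather than CPU work; the best worker count depends on the storage system and is left as a parameter.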
191 00:29:48.630 --> 00:30:00.060 Davis Bennett: I think all of this falls under new territory, and people maybe need to learn new tricks. Like, for me, deleting big containers took a long time, so I wrote some code to parallelize it and now it's not an issue. 192 00:30:02.430 --> 00:30:07.680 Davis Bennett: Modern computers have enough cores that I think that just some coding can get around some of these issues. 193 00:30:10.650 --> 00:30:14.610 Bill Katz: It seems like one of the things that this kind of brings up 194 00:30:15.150 --> 00:30:27.750 Bill Katz: Is the notion that with the single, you know, chunk per file versus, you know, the unsharded versus the sharded representation, for example in Neuroglancer, and also immutable versus mutable. 195 00:30:28.350 --> 00:30:36.450 Bill Katz: Because I think those distinctions mean a lot in terms of shareability, how much data needs to be transformed. 196 00:30:37.980 --> 00:30:45.060 Bill Katz: And it also raises the question, too, of API versus file format, because if you're doing an API 197 00:30:45.510 --> 00:30:59.280 Bill Katz: And data engine, you can do a lot of transformation before you actually get the data, whereas if you're assuming that you're reaching directly into a chunk and need to interpret it immediately, the file format, 198 00:30:59.880 --> 00:31:08.490 Bill Katz: Then, um, you know, that limits your options in adding things like transactions, versioning, etc. 199 00:31:15.360 --> 00:31:29.520 Blair Rossetti (Janelia): To Davis's point as well, we're always keeping the user perspective in mind, because while it might be easier for us to write some code to delete this stuff in parallel, the users undoubtedly will just try and click and drag stuff. 200 00:31:31.350 --> 00:31:37.500 Blair Rossetti (Janelia): And anything that slows them down is going to be a headache for us because they'll complain about the past and future.
201 00:31:38.550 --> 00:31:47.490 Blair Rossetti (Janelia): So I think mechanisms that are non-programmer friendly are going to be critical, at least from our niche. 202 00:31:51.840 --> 00:31:59.610 Jamie Sherman: From my perspective, and I come from a bit of an outside space and from a mass spec background originally, is that 203 00:32:00.450 --> 00:32:10.470 Jamie Sherman: Like, the formats that are created by the instrument manufacturers are entirely designed around dumping data to disk as fast as possible, because acquisition speed is a selling point. 204 00:32:11.190 --> 00:32:22.260 Jamie Sherman: And so they're never going to be optimized for viewing, and they never, like, having worked for a mass spec company, they never spend. They're not interested in spending time 205 00:32:22.890 --> 00:32:37.620 Jamie Sherman: Optimizing a viewer for you, like a file format for using it. It's purely for acquisition. But so, like, so I think that there is a valid role for creating, quote unquote, yet another file format. 206 00:32:39.960 --> 00:32:49.020 Jamie Sherman: Assuming it gives us the usability that we want. And from my, like, the one thing I would love to see, or what gets me excited about Zarr and these other formats 207 00:32:50.280 --> 00:32:51.450 Jamie Sherman: That are chunked, is that 208 00:32:52.470 --> 00:32:58.680 Jamie Sherman: I really like the idea of being able to put the data, like, we're trying to share data, right, and we're trying to share it in an open format.
209 00:32:59.580 --> 00:33:18.360 Jamie Sherman: And if we can put it on, you know, Google or AWS or whatever cloud server and let people actually browse the data in that context, that's incredibly valuable, because currently we're already sharing quite a bit, a fair few terabytes of data, and 210 00:33:20.280 --> 00:33:29.340 Jamie Sherman: And the challenge we have is that nobody's going to go there and download a couple terabytes of data to browse through it because, you know, storage things, etc. 211 00:33:30.420 --> 00:33:44.520 Jamie Sherman: And if the file format is sort of dynamic and we can use, you know, fsspec or whatever to pull the data from, you know, and interact with it through a browser, like napari or whatever, 212 00:33:45.720 --> 00:33:59.970 Jamie Sherman: Then that makes the data much more usable. It makes it discoverable from an experimentalist's perspective. I think we're. It sounds like there are two kind of target markets here: one that's kind of the web side, and the other that's more of the 213 00:34:02.370 --> 00:34:06.870 Jamie Sherman: Like, the user in a lab who's just dealing with the file as an entity. 214 00:34:08.070 --> 00:34:13.830 Jamie Sherman: And so I guess breaking things into those perspectives might be helpful. Anyway, thanks for listening. 215 00:34:15.570 --> 00:34:17.370 John Bogovic: But I have some related points. 216 00:34:20.490 --> 00:34:27.900 John Bogovic: I think, I think those are great points. And related to that, um, my feeling is that 217 00:34:28.920 --> 00:34:42.060 John Bogovic: While copying data is important, I think I would like to, I don't know if "educate" is the right word, but I think users should avoid just copying data as much as possible.
218 00:34:42.570 --> 00:34:46.500 John Bogovic: And especially when data gets large enough. 219 00:34:47.100 --> 00:34:57.930 John Bogovic: And to that end, I guess I would hope that there is a regime in which these chunked file formats just shouldn't be used, that is, small to medium sized data. You shouldn't have a million chunks of a 220 00:34:58.350 --> 00:35:09.630 John Bogovic: Data set that is small enough, whatever that means. So, I mean, this is, I think, true of Zarr and probably true of N5, that there might be a regime in which 221 00:35:10.560 --> 00:35:22.980 John Bogovic: It's chunked into just one chunk and that's it. Is that reasonable? I'm curious what people think about that idea, that is, chunked data sets should have a fallback to an unchunked 222 00:35:24.030 --> 00:35:25.890 John Bogovic: version that is still viable. 223 00:35:28.710 --> 00:35:36.120 Ilan Gold: I just want to comment on something I ran into maybe a couple months ago. I work on the HuBMAP portal, and something 224 00:35:36.570 --> 00:35:44.760 Ilan Gold: I ran into when developing our tools was that one of the things you can't do with Zarr that you can do with, like, a single file is you can't just, like, download it in your browser. You need 225 00:35:45.060 --> 00:35:51.120 Ilan Gold: Like, utilities to get at that folder structure, I think. I don't think you can kind of just wget a Zarr store. 226 00:35:51.450 --> 00:35:59.130 Ilan Gold: Unless, I think there's maybe, like, a way to do it. But as far as I understand, there's not, like, a way to just wget it. Like, you need a command line utility 227 00:35:59.430 --> 00:36:09.000 Ilan Gold: For a cloud provider, gsutil or aws s3 cp. So, like, that's maybe another case where, like, having both file formats stored would be useful.
228 00:36:09.390 --> 00:36:15.510 Davis Bennett: So, I think the fundamental issue is that the metadata is stored in a separate object from the data. 229 00:36:16.590 --> 00:36:20.430 Davis Bennett: Unless you zip it together, there's not one thing you could get that has the whole story. 230 00:36:20.760 --> 00:36:23.130 Ilan Gold: Right, so I brought this up specifically because you said, like, oh, 231 00:36:23.160 --> 00:36:34.320 Ilan Gold: You know, it's obviously, like, possible to do this, but to non-computer, like, it requires, like, a little bit of computer science here to be able to, like, download this utility or use the utility, whereas some users maybe don't want that. So, 232 00:36:36.750 --> 00:36:39.390 Josh Moore: So I have a couple of thoughts, they're not 233 00:36:41.130 --> 00:36:53.880 Josh Moore: Exactly well structured on this. From the OME side, certainly, and what we did just in the experiments we've done so far with Zarr, um, it's a huge benefit to be able to say 234 00:36:55.290 --> 00:37:07.860 Josh Moore: This thing is isomorphic wherever we put it. So we have the images locally, we work with it one way, and the code works identically to if it's remote. Obviously there are performance impacts, but, um, 235 00:37:08.880 --> 00:37:18.030 Josh Moore: And I actually think with this specification. Well, I think a goal of any specification we come up with should also then apply to other formats. So 236 00:37:18.450 --> 00:37:29.640 Josh Moore: You know, in the regime where you would rather have one file, use HDF5, you know, that's fine. It's a good format and, you know, it has its limits. But all of these things do. It's a trade-off.
237 00:37:30.240 --> 00:37:41.100 Josh Moore: Um, I think it's still useful for us to have these conversations about these various conventions, like other communities have done. So I'm thinking of the geospatial community, where 238 00:37:41.490 --> 00:37:57.360 Josh Moore: There's an entire process where they go through saying here's how we grow and how we're going to represent maps, and they've gone through exactly this process. There was NetCDF for many years, and their data sat in NetCDF, which is HDF5, and now they're moving to Zarr and free 239 00:37:58.410 --> 00:38:10.170 Josh Moore: Own. Um, and now, you know, the underlying NetCDF library is being taught how to work with Zarr. So the two are completely perfect, so you know that you're getting everything out of a piece of data 240 00:38:10.980 --> 00:38:20.700 Josh Moore: Regardless of what store it's sitting in. You know, so I think there's an abstraction here that we can use to stop worrying a little bit about 241 00:38:21.240 --> 00:38:26.370 Josh Moore: The specifics. You know, we want to enable for the user that they can choose one or the other, um, 242 00:38:27.000 --> 00:38:37.560 Josh Moore: But if it's, but if you're sure. So what's missing at the moment from Bio-Formats is, you know, we don't know from the proprietary file formats if we have absolutely everything, is it structured properly. 243 00:38:37.980 --> 00:38:43.470 Josh Moore: But if we've made these specifications together as a community, we can say we want to do this in HDF5. 244 00:38:44.190 --> 00:38:55.020 Josh Moore: We could choose a sub-format of HDF5, we could choose BDV or Imaris, and say we're going to do the single-file use case in this container. 245 00:38:55.560 --> 00:39:10.980 Josh Moore: For the cloud use case, you know, we're working to have Zarr and N5, the same underlying specification.
Okay, that needs to be somehow unified, and then there will be one cloud-based, you know, small-chunk-based format and it's all the same. 246 00:39:12.060 --> 00:39:22.020 Josh Moore: So I think, um, it's an involved way of saying, you know, try to keep the number of formats as small as possible, which is what a lot of you mentioned, you know, in your intros. So 247 00:39:23.010 --> 00:39:34.620 Josh Moore: Maybe it's a question about how to go about doing that. If I roll back even farther and start to think about the API versus file format, I guess I have, and maybe it's a naive 248 00:39:35.310 --> 00:39:52.890 Josh Moore: Hope, that we can have a format that is this isomorphic and that we can put anywhere, we can store it in any way. And then we start to build up extensions to the specification that allow us to do some of the more complex, API-like 249 00:39:54.510 --> 00:40:00.300 Josh Moore: indexing of the data. You know, there are things that are beyond what a file format can do. Um, 250 00:40:00.960 --> 00:40:14.340 Josh Moore: I guess just, if I go to a service, I would like that service to still provide me the base underlying specifications that we've all agreed on, you know, in the simplest way to get it. You know, I wget the entire data. 251 00:40:15.840 --> 00:40:24.000 Josh Moore: But if you're on top of it, by all means, why not, right? It may be that some implementations don't 252 00:40:25.080 --> 00:40:34.710 Josh Moore: Support all of the indexing, and I think this is probably a lot. I mean, I don't know much about Neuroglancer. I basically know what Eric has taught me. Um, but I think 253 00:40:35.940 --> 00:40:46.590 Josh Moore: You know, I think all of its functionality can be basically considered an extension to the spec that some communities may need, and then we kind of get the best of both worlds. Okay, that was my spiel.
254 00:40:51.390 --> 00:40:51.750 I mean, 255 00:40:57.510 --> 00:41:08.520 Davis Bennett: I would just like. I was used to using image viewers like Fiji or ImageJ. Then, as a user, you need to think about the file format of your data. 256 00:41:09.480 --> 00:41:20.820 Davis Bennett: Because you need to know what's in the file system. You need to know, can ImageJ read my TIFF file? ImageJ, for a while, 64-bit floats, for a while. So this was always a headache for me. 257 00:41:22.650 --> 00:41:41.550 Davis Bennett: But now that I'm working a lot with Neuroglancer, it's extremely refreshing. The end user has no idea what format my data is in. And I think a lot of users would actually prefer to be in that situation and never need to care what file format the data is actually stored in. 258 00:41:42.630 --> 00:41:50.280 Davis Bennett: They just know where the data is, and can my tool understand what data is at that place. 259 00:41:51.480 --> 00:42:00.630 Davis Bennett: So I think my experience working with image as an API or image as a service has been, it's felt way nicer than image as a format. 260 00:42:02.010 --> 00:42:10.560 Davis Bennett: So, if the visualization tools would approach. But if Fiji, for example, would stop having an opinion about, you know, different TIFFs, 261 00:42:10.980 --> 00:42:24.750 Davis Bennett: Would delegate actually handling the file format to some other service, and would just focus on rendering images that it gets from who knows where, I think a lot of these problems would, obviously, get pushed maybe somewhere else. 262 00:42:27.210 --> 00:42:27.540 Davis Bennett: I 263 00:42:28.380 --> 00:42:30.600 Josh Moore: Still think some of the conversations we're having, we're still gonna 264 00:42:31.560 --> 00:42:33.330 Josh Moore: We're still going to need to have, but 265 00:42:34.800 --> 00:42:35.940 Josh Moore: I don't disagree with you at all. 266 00:42:37.980 --> 00:42:38.850 Josh Moore: Caterina, you have your hand up.
267 00:42:43.560 --> 00:42:45.210 Caterina Strambio: Yes, primarily a question. 268 00:42:46.560 --> 00:42:52.290 Caterina Strambio: So the idea is, you know, I think, a minimal number of formats. I'm 269 00:42:54.030 --> 00:43:05.220 Caterina Strambio: Definitely, obviously, very much on board. I'm just thinking, I don't know if we have addressed this issue or if we need to address it here: how do we get from 270 00:43:05.850 --> 00:43:27.300 Caterina Strambio: Here to there? Because, I mean, we know that the manufacturers of the microscopes, and the software that goes with that microscope, will want to continue to produce their file formats, so that they maximize, as Jamie was saying, 271 00:43:28.560 --> 00:43:29.820 Caterina Strambio: Writing on disk. 272 00:43:32.430 --> 00:43:34.560 Caterina Strambio: Is the idea that we're just gonna 273 00:43:35.970 --> 00:43:48.990 Caterina Strambio: Come up with a great solution, we're going to ask them to participate in a conversation and then present them with, you know, like a fait accompli, like, this is what we want to do, and come up with it or 274 00:43:50.010 --> 00:44:01.560 Caterina Strambio: Which, I mean, I don't have anything in principle against it. It's just, I'm just thinking, do we need to talk about how we're going to get there in terms of what manufacturers will do? 275 00:44:02.250 --> 00:44:07.590 Ulrike Boehm: I think they should be invited as well, right? To my mind, like, everyone who kind of works 276 00:44:08.250 --> 00:44:20.430 Ulrike Boehm: In a slight way on data formats. Also, the developers of current data formats like N5 and Zarr and so on and so forth.
They need to be part of this discussion as well because they are actually the people who 277 00:44:20.910 --> 00:44:29.460 Ulrike Boehm: Also know a bit more about the feasibility, whether this type of data format, how it has been created, is flexible enough to fulfill the demands, right? 278 00:44:30.330 --> 00:44:32.040 Caterina Strambio: Well, the developers of the 279 00:44:32.700 --> 00:44:38.790 Caterina Strambio: N5, Zarr and such, they're part of it. The question is more, like, the manufacturers. So just clubs. Anyway, I'll shut up. 280 00:44:41.520 --> 00:44:52.830 Davis Bennett: Is it the many years, at least for light microscopes, they've had the opportunity for years to at least be on a standard between manufacturers. So, as far as I know, Zeiss and Olympus 281 00:44:53.940 --> 00:45:00.300 Davis Bennett: All use different formats. So if they actually cared about providing something that would be convenient for users, they would have done it. 282 00:45:03.450 --> 00:45:04.860 Josh Moore: So I guess I want to give them the 283 00:45:04.890 --> 00:45:07.950 Josh Moore: Benefit of the doubt. So we've had multiple conversations. 284 00:45:08.370 --> 00:45:17.280 Josh Moore: So Stephen can relate Wagner from Zeiss like no Conrad was on the session this morning, you can kind of see his statements, um, 285 00:45:18.720 --> 00:45:27.480 Josh Moore: There were, you know, a handful of vendors, five or six, at the European bio imaging industry board meeting, I guess, at the end of 2018. 286 00:45:28.140 --> 00:45:39.720 Josh Moore: So it's been better than ever before, talking to vendors and hearing them say, okay, we could see possibly having a developer work with you guys to develop something, we could see 287 00:45:40.710 --> 00:45:50.730 Josh Moore: And some of the statements are, well, if there's a good C or C++ library, then okay, we'll consider. You know, we're making baby steps towards this. And I guess that's what Caterina is saying as well.
288 00:45:52.080 --> 00:45:58.980 Josh Moore: You know, from our point of view, what we can see doing is just trying to get to the point where there's a solid 289 00:45:59.880 --> 00:46:10.020 Josh Moore: Specification, solid libraries, um, and hope to sell it to them. You know, I don't know anything else to do. It sounds like there's additionally 290 00:46:10.980 --> 00:46:22.290 Josh Moore: The API conversation on top of that, which I think is still interesting. I don't exactly know what the vendors will, how they will see their interaction with the API as opposed to a file format, you know. 291 00:46:22.890 --> 00:46:30.240 Josh Moore: I don't disagree with what you guys are saying. You know, we're very, very file format based, and just, we've been forced to be for many years. 292 00:46:31.500 --> 00:46:43.290 Josh Moore: Maybe there's a way to free ourselves to look even higher for a better solution. I still think it will revolve around good implementations, good specifications and lots of conversations. 293 00:46:45.120 --> 00:46:54.570 Josh Moore: You know, I just assume if we start to look at APIs, for example, there will be, you know, things that other communities are going to need to add into that API. And to the extent that that's doable, 294 00:46:54.780 --> 00:47:09.630 Josh Moore: You know, all on board. But it's still the same process, and maybe we redefine what NGFF is, but it's still the same process of this community and as many other communities as want to participate. Completely agree with Ulrike, you know, everyone should be on board. 295 00:47:11.310 --> 00:47:21.180 Josh Moore: Of making sure, you know, we have a process to go, if we don't agree, if the API or the file format is slightly different in two places, how do we get them unified. 296 00:47:22.800 --> 00:47:35.100 Josh Moore: Basically, you know, mea culpa, I've mostly focused on doing that with Zarr and N5 to date.
You know, we got money to get someone to work on it. They're working on it, it should be done by the end of the year. I'll cross my fingers. 297 00:47:36.330 --> 00:47:49.950 Josh Moore: And then we move on to the next problem, right. So if any of you know, you know, answering or trying to answer Caterina's first question, if you know the next thing we really need to tackle to make this happen, you know, let's do it. That's what this meeting is all about. 298 00:47:50.550 --> 00:47:50.790 Bill Katz: I think 299 00:47:50.820 --> 00:47:52.020 Josh Moore: All of one, sorry. Go ahead. 300 00:47:52.950 --> 00:47:55.650 Bill Katz: So from my perspective, it seems like 301 00:47:57.000 --> 00:48:04.890 Bill Katz: You know, I see a lot of the discussion is primarily focused around dense data, and that kind of makes a lot of sense. 302 00:48:05.310 --> 00:48:18.570 Bill Katz: Because 2D, 3D and 2D image data will be dense, and so when you're doing physical partitioning like in N5 and Zarr, that also makes sense because each of the files is a relatively similar size. 303 00:48:19.080 --> 00:48:28.410 Bill Katz: One of the things that I really liked about TileDB, for example, is that they completely differentiate between the physical partitioning and the case for 304 00:48:28.740 --> 00:48:29.640 Bill Katz: sparse data. 305 00:48:30.180 --> 00:48:40.410 Bill Katz: For things that you can't easily physically partition, because then you want to do the partitioning more based on capacity. 306 00:48:41.400 --> 00:48:53.310 Bill Katz: And so I think one of the issues that I see with trying to standardize a file format is you have to know if the file format would be suitable for the different kinds of use cases. 307 00:48:53.790 --> 00:49:04.440 Bill Katz: And it's not clear to me, for example, that you could get a file format that supports both dense and sparse. I believe in the TileDB case, for example, they have two different file
308 00:49:05.220 --> 00:49:18.990 Bill Katz: formats for those two different cases. And I think that they're trying to create a universal data format, and they think that the dense and the sparse case handles everything so that they can build data frames on top of it, etc. 309 00:49:19.440 --> 00:49:28.920 Bill Katz: But I think one way to approach this might be saying, okay, what is the minimal set of sort of characteristics of data that we need to support, 310 00:49:29.640 --> 00:49:35.070 Bill Katz: And what are the file formats that we believe are, you know, at least suitable for that. 311 00:49:35.700 --> 00:49:49.890 Bill Katz: Because it's not clear to me, for example, if you go the N5/Zarr route, whether that's even suitable for, for example, doing sparse volumes, where you're just interested in a few neurons, you know, that are spanning massive amounts of space. 312 00:49:51.090 --> 00:49:56.250 Bill Katz: Or if you're doing meshes or other things. So it really depends on the type of data. 313 00:49:57.360 --> 00:50:04.440 Josh Moore: So agreed. So just as a side note, we have had conversations with TileDB and actually just from aerosol 314 00:50:05.130 --> 00:50:22.020 Josh Moore: Go outside of having some base implement a SAS based format that everyone's using and then building their features on top of it. So I would hope we could even work with TileDB. I know their implementation is a very strong one. So, okay, um, make use of that and tie everything together. 315 00:50:22.440 --> 00:50:35.100 Bill Katz: I guess I'll also put a disclaimer here that I'm starting to collaborate with the founder of TileDB on adding branch versioning to their system, which I think is natural, given the way that they handle fragments and stuff like that. 316 00:50:39.120 --> 00:50:40.560 Josh Moore: So I think I'm good. Go for it.
317 00:50:41.460 --> 00:50:56.160 Ola Tarkowska: But basically I definitely agree with you, Bill, that the use cases are crucial to define because, as I said in the chat, I found that the needed optimization in the file format 318 00:50:57.420 --> 00:51:09.450 Ola Tarkowska: needs to happen on the process level, and this optimization is happening for each individual process. So whether it's visualization, analysis, registration or 319 00:51:10.080 --> 00:51:28.200 Ola Tarkowska: archiving, they require very different data. It's the same pixel data, but different formats are needed to process this the most efficient way. And if we dig more into those processes, they just require different solutions, so 320 00:51:29.220 --> 00:51:35.670 Bill Katz: I think we can see that very easily, too, if you're supporting visualization applications. Right. 321 00:51:35.970 --> 00:51:46.320 Bill Katz: Because the style that it's written in, you almost want it to be on GPU available. So, like, for example, Neuroglancer 322 00:51:46.710 --> 00:51:58.290 Bill Katz: Has, as part of their format, that he designed a compression scheme for segmentation such that it can be easily loaded, and then, you know, be altered using shader code or whatnot. 323 00:51:59.580 --> 00:52:08.970 Bill Katz: And so I think that it's almost like in the database world. You can almost see this denormalization approach, right, that you'd like to have a normalized form of the data. 324 00:52:09.300 --> 00:52:18.150 Bill Katz: But it just is not suitable for a variety of use cases where you're going to want to denormalize the data so that it can be rapidly done, and so 325 00:52:19.440 --> 00:52:26.340 Bill Katz: I think that's another aspect, because, like, we're also seeing the graphics card companies are starting to support things like direct storage, 326 00:52:26.730 --> 00:52:38.190 Bill Katz: Where you can move the data directly from whatever store you have into the GPU.
Um, and so, yeah, these are all things that are sort of part of it. 327 00:52:44.220 --> 00:52:44.730 Josh Moore: I think I missed 328 00:52:44.880 --> 00:52:45.810 Nicolas Chiaruttini: Some other hands. 329 00:52:45.870 --> 00:52:46.350 Josh Moore: Go for it. 330 00:52:46.860 --> 00:52:59.520 Nicolas Chiaruttini: If I can comment, just quickly, on the political side towards the vendors, I would just like to mention that, being in a microscopy facility, or also an imaging facility, 331 00:53:00.150 --> 00:53:09.780 Nicolas Chiaruttini: They need to have a common workflow for all the vendors, to get something which is not 50 different processes to do the same segmentation. 332 00:53:10.200 --> 00:53:22.260 Nicolas Chiaruttini: And basically, in the call for tender, you can explicitly specify you want to have the Bio-Formats API supported, in multi-resolution and so on. 333 00:53:23.220 --> 00:53:33.840 Nicolas Chiaruttini: And this is an argument also to push the vendors to go towards something unified. So if there is a good specification, then you can push 334 00:53:34.740 --> 00:53:52.410 Nicolas Chiaruttini: For this at the microscopy facility level. So we did this with the microscope. Unfortunately, we did not specify multi-resolution, so we ended up with a Bio-Formats API without multi-resolution, and I'm pushing a lot the people from Leica. 335 00:53:53.490 --> 00:54:01.710 Nicolas Chiaruttini: Not to mention them, that I hope will work towards this. But just to mention that at the microscopy facility level, you can really 336 00:54:02.430 --> 00:54:12.960 Nicolas Chiaruttini: Have an impact, because when you buy equipment, if the vendors see in the call for tenders that they need to support something, at some point, simply, they will do it. 337 00:54:20.910 --> 00:54:21.720 Jason Swedlow: A couple comments.
338 00:54:22.920 --> 00:54:25.050 Jason Swedlow: So, say a couple things just from the standpoint 339 00:54:25.050 --> 00:54:26.250 Jason Swedlow: Of our experience. 340 00:54:28.560 --> 00:54:38.220 Jason Swedlow: With respect to the vendors, I think Josh is correct. There's an awful lot of interest. It's fair to say Zeiss has been the most present, but we know 341 00:54:38.670 --> 00:54:46.020 Jason Swedlow: We're getting a lot of attention from Nikon and Olympus, as well as Leica. I've had a lot of conversations, high-level conversations, with Leica. 342 00:54:48.210 --> 00:54:50.850 Jason Swedlow: Probably just to summarize an awful lot, 343 00:54:52.440 --> 00:54:56.490 Jason Swedlow: This is going to be a long process to work through with them, and it's going to be, 344 00:54:56.940 --> 00:55:05.820 Jason Swedlow: I think, just parroting Josh, it's going to be a lot of talking, but it will be based on what he just said, which is, you know, very strong specifications, and if those exist, 345 00:55:06.270 --> 00:55:16.710 Jason Swedlow: They will pay attention and they will adopt them. Proof of that is, we've been anecdotally told several times that, from several manufacturers, they tell their customers to use Bio-Formats. 346 00:55:17.100 --> 00:55:22.770 Jason Swedlow: Right, so that's kind of an example of that. So that's the first thing. That's possibly 347 00:55:23.340 --> 00:55:29.130 Jason Swedlow: A little bit pathologically optimistic. I'm usually accused of being that, that's my role in the project. But, you know, 348 00:55:29.940 --> 00:55:38.010 Jason Swedlow: But all of that won't happen in the next week. I think the other thing we're seeing is a single format just won't do it, and to pretend that 349 00:55:39.000 --> 00:55:41.790 Jason Swedlow: All of the use cases that we're talking about.
We're covering 350 00:55:42.750 --> 00:55:54.960 Jason Swedlow: First of all, the modalities that we're talking about, the kinds of visualization that Bill is thinking about with respect to, you know, a whole brain, and I will actually want to focus on two different neurons that are separated by millimeters, 351 00:55:56.070 --> 00:56:06.600 Jason Swedlow: public repositories, etc. I mean, there's just extremely, extremely different needs and use cases. So the pretense that, oh, there will be a single thing out there, it will solve all the world's problems, 352 00:56:06.990 --> 00:56:18.840 Jason Swedlow: I think needs to be put to bed. I will tell you, however, that, as I kind of slightly jokingly mentioned, you know, I end up having to pay the bills around here, at least for me. 353 00:56:19.620 --> 00:56:24.180 Jason Swedlow: I was on a two-hour call with the funder. And so if you say to them, 354 00:56:24.720 --> 00:56:36.780 Jason Swedlow: If the call is about interoperability and data access and open data, and they say, wait, wait, wait, you're going to solve this by coming up with yet another file format? Actually, no, not just one, probably four or five. 355 00:56:37.290 --> 00:56:50.850 Jason Swedlow: You know, basically, understanding that is really super hard. And so that gets to be a hard, hard sell. So we've had some success with CZI, and we're very grateful for that. But the reality is we're going to have to 356 00:56:51.300 --> 00:57:05.880 Jason Swedlow: Work on how we present this, actually, in a very serious way, not only to funders but to our institutions as well. Right. So, I mean, you know, selling this within the various institutions that you all work for is going to be a challenge. 357 00:57:07.680 --> 00:57:18.630 Jason Swedlow: But, you know, that's just part of it. And there was one final point, I can't remember what it was. But yeah, I think there's just the reality.
Many of the things that we're talking about here are 358 00:57:21.030 --> 00:57:31.740 Jason Swedlow: are reflective of where we are. Sorry, the last point, with respect to API. I'm also the person who ends up having to fund Bio-Formats. 359 00:57:33.690 --> 00:57:48.360 Jason Swedlow: We have a lot of experience in using Bio-Formats in the Image Data Resource. You know, the idea of an API and just random formats underneath is running out of steam. It really is too much to ask, 360 00:57:50.370 --> 00:58:02.970 Jason Swedlow: You know, when driven by the data, with the complexity of the data coming in. The sheer size is part of it, but part of it is all the other things that we're discussing. So 361 00:58:04.320 --> 00:58:17.280 Jason Swedlow: I think in our experience, and I think everybody on the team agrees with me, if not, say so, we need some kind of convergence at that storage level. 362 00:58:18.660 --> 00:58:35.310 Jason Swedlow: So that, in partially responding to Davis's point, which I'm sure we all agree with, is that having infinite divergence at that level, we're seeing that run out of steam. So file formats basically can't keep up. 363 00:58:36.510 --> 00:58:45.570 Jason Swedlow: When you're working at scale, right, so that's the sheer reality, we see that all the time, et cetera. Sorry. So, several points, but 364 00:58:47.100 --> 00:58:56.010 Jason Swedlow: There will be multiple formats. I think, like, I think it was Josh said, you know, can we nail down exactly what are the things that we're going to need 365 00:58:57.120 --> 00:59:05.220 Jason Swedlow: To satisfy this range of requirements. So, you know, Bill's got one end, which is actually extremely important. I mean, that's a key biological problem. 366 00:59:05.700 --> 00:59:17.220 Jason Swedlow: We, with public data resources, have another end, have several other dimensions to think about there.
But, you know, if we can nail those down, I think that would really help us decide a path forward. 367 00:59:20.070 --> 00:59:23.700 Bill Katz: So Jason, if I could comment on what you just said about, 368 00:59:24.720 --> 00:59:31.980 Bill Katz: you know, the explosion of the file formats. There were, and I probably should have posted the links on that, there were 369 00:59:32.640 --> 00:59:41.640 Bill Katz: two blog posts, one by OpenML about what would be the ideal data format for open machine learning data sets. 370 00:59:42.090 --> 00:59:58.350 Bill Katz: And then there was a response by the TileDB founder about why they should consider TileDB as a universal data format, and it was primarily focused on, and I would say that the one argument on which I kind of agree with Stavros is 371 01:00:00.180 --> 01:00:02.190 Bill Katz: the TileDB approach, which is that 372 01:00:02.820 --> 01:00:14.730 Bill Katz: the focus should be on data engines and APIs over file formats, because at least with data engines, and by data engine I just kind of mean that it could be a library, which could be like an embedded library, 373 01:00:15.060 --> 01:00:21.660 Bill Katz: where you don't actually know what's going on, but you have an API, you can ask for things, and it automatically figures out 374 01:00:21.960 --> 01:00:30.300 Bill Katz: how to get the information that you need. Or it can be an HTTP API or some kind of service, which also gives you that ability to kind of, like, 375 01:00:31.080 --> 01:00:38.460 Bill Katz: you know, change everything on the back end. Fine, you can change the file formats, but your request on the client side won't change. And 376 01:00:39.390 --> 01:00:57.030 Bill Katz: so I think the TileDB approach is that we should focus on that kind of effort, and I'm curious, because it almost seems like that might be an interesting, you know, place to start thinking about it: file format versus data API or embedded engine.
377 01:00:57.480 --> 01:01:09.570 Jason Swedlow: So just to say, I mean, at least from the group that I know, I think Melissa and Aaron have the most experience with TileDB. So, Glencoe's bioformats2raw converter already supports TileDB. 378 01:01:12.390 --> 01:01:17.130 Jason Swedlow: I'm misspeaking, so Aaron says it's not. So yes, it is something we're definitely looking at. 379 01:01:17.610 --> 01:01:17.850 Bill Katz: I mean, 380 01:01:18.390 --> 01:01:29.340 Bill Katz: the interesting aspect is that we're already kind of seeing this, in that there's a number, like, for example, the N5 implementation apparently supports Zarr in the back end, the Zarr implementation 381 01:01:30.090 --> 01:01:36.660 Bill Katz: supports N5, then you have TensorStore now, and what Jeremy's doing with Neuroglancer is he's supporting, 382 01:01:37.020 --> 01:01:50.280 Bill Katz: my god, he's supporting DVID, BOSS, you know, a couple of other connectomics APIs, as well as precomputed, as well as N5 and Zarr. And so, you know, what we're seeing is that the 383 01:01:50.610 --> 01:01:59.010 Bill Katz: data engines, the embedded libraries, are basically doing the job of, like, figuring out all the file formats. 384 01:01:59.100 --> 01:02:06.930 Josh Moore: So this sounds like something we should do a breakout on, and I mean, I'm more than happy to take time, 385 01:02:07.680 --> 01:02:20.610 Josh Moore: even in the next, you know, day or two. My experience has been, when you start this, eventually you always get back to the question of who's taking the responsibility that Bio-Formats currently has. It doesn't go away. 386 01:02:21.120 --> 01:02:32.070 Josh Moore: And currently we have it. We're trying to get rid of it. And it's going to fall on someone else if we don't solve this, if we're trying to be good about this, right? You know, you want to get rid of this. No one should suffer the pain.
387 01:02:33.480 --> 01:02:42.690 Josh Moore: It always becomes, as long as you have a number of implementations, and I've worked heavily with the N5-Zarr and the Zarr-N5 implementations, and they have the same problems, they don't, 388 01:02:43.170 --> 01:02:55.470 Josh Moore: it's all hard coded at the moment, and it's just barely working. So we need a better way, whether it's the engine or a format. We're definitely not there yet. Sorry, Damir has really been struggling, and then Caterina. 389 01:02:57.150 --> 01:03:05.820 Damir Sudar: You're probably asking the same thing, or proposing the same thing. And so this is pretty much, by the 390 01:03:07.230 --> 01:03:29.280 Damir Sudar: way, the separation into the common part of the storage format, whether it's file or API or whatever, and the implementation of where pixels go. So if we can think of moving as much of the thing that should be as common as possible into the metadata layer, and as little as possible 391 01:03:30.780 --> 01:03:43.620 Damir Sudar: into the implementation of the where-pixels-go layer, then potentially we can move as much to a common API or file format as 392 01:03:44.220 --> 01:03:53.250 Damir Sudar: we can achieve. And then it becomes maybe a little easier to just swap out that pixel storage layer underneath 393 01:03:53.670 --> 01:04:01.980 Damir Sudar: when it's needed, or support multiple ones if that's really how it needs to be. And clearly, from the discussion, it needs to be that way. 394 01:04:02.550 --> 01:04:11.760 Damir Sudar: So push as much as possible into the metadata layer, which should be common, and there should not be multiple versions of or multiple access points to 395 01:04:13.890 --> 01:04:13.980 That 396 01:04:23.370 --> 01:04:24.480 Caterina Strambio: Can I speak? 397 01:04:28.560 --> 01:04:29.490 Josh Moore: My mic was awesome.
398 01:04:31.020 --> 01:04:32.850 Caterina Strambio: Okay, no, I just wanted to, 399 01:04:35.280 --> 01:04:37.860 Caterina Strambio: just a point of clarification for me, 400 01:04:39.360 --> 01:04:48.120 Caterina Strambio: maybe others. So we're talking about, you know, the combination, kind of the dichotomy between API versus format. 401 01:04:50.070 --> 01:05:00.870 Caterina Strambio: But I'm trying to understand. So if we would have multiple, because we decided that there is not going to be a single new format, so it's going to be next generation file formats, 402 01:05:01.950 --> 01:05:04.260 Caterina Strambio: then there will be an API between those? 403 01:05:05.460 --> 01:05:08.670 Caterina Strambio: Or not? Am I misunderstanding something? 404 01:05:11.400 --> 01:05:11.730 Josh Moore: Maybe 405 01:05:11.970 --> 01:05:16.920 Caterina Strambio: There should be translation, I mean, if we decided that there needs to be more than one. 406 01:05:18.360 --> 01:05:29.550 Josh Moore: From our side, yeah, there will. If there are multiple formats in the OME space, you will have to be able to translate between them. So that's a given. So from the API side, 407 01:05:30.300 --> 01:05:39.360 Josh Moore: I don't know, does everyone feel comfortable that there's one clear API that represents all the needs, and you don't need to map between multiple APIs? I find that 408 01:05:39.720 --> 01:05:40.770 Josh Moore: amazing, but 409 01:05:40.950 --> 01:05:44.820 Davis Bennett: I think for dense image data that's reasonable. 410 01:05:46.170 --> 01:05:47.280 Davis Bennett: Like, if you think of 411 01:05:47.640 --> 01:05:49.140 Davis Bennett: an image as being a 412 01:05:50.250 --> 01:06:00.120 Davis Bennett: grid of pixels.
The job of something, some software that consumes the image, is just to ask for some region of pixels, and it gets the values back. 413 01:06:00.840 --> 01:06:11.910 Davis Bennett: Then a separate query for metadata. I think that's abstract enough to cover 99% of what people do, what applications do that consume dense images. Sparse stuff and other things, I think, is 414 01:06:12.060 --> 01:06:14.280 Josh Moore: More complex. Yeah. So I think that's an extra call, right, so 415 01:06:15.300 --> 01:06:17.430 Bill Katz: Yeah, I think there are usually 416 01:06:18.480 --> 01:06:27.990 Bill Katz: multiple APIs, where one API is particularly for a particular use case. So for example, you might have an n-dimensional array API, 417 01:06:28.440 --> 01:06:43.740 Bill Katz: or a multi-dimensional, I mean, sorry, multiscale multi-dimensional array API. And that serves, you know, most of what you might need for the image format. But if you wanted to do, for example, like, a log 418 01:06:45.060 --> 01:06:57.150 Bill Katz: API, that might be a little bit different, and the format of the data might be different, like if you just wanted to replay all the commands or all the issues. And so that's one of the other things, too, is, like, 419 01:06:58.080 --> 01:07:08.430 Bill Katz: what are the characteristics of the data, and how would you have an API? What is the minimal set of APIs necessary for the data that we need in science? 420 01:07:15.660 --> 01:07:22.110 Ola Tarkowska: I think, to answer your question, the minimum API could be the one that describes the data life cycle. 421 01:07:24.240 --> 01:07:45.570 Ola Tarkowska: Because we speak a lot about data, but the data life cycle could also be taken into account, where actually this defines the purpose of each workflow, you know, of the process, how data is transformed, if that makes sense. This is a different approach.
422 01:07:48.540 --> 01:07:53.850 Bill Katz: I mean, to be more concrete about, like, systems that we had to make on our side: 423 01:07:54.780 --> 01:08:03.180 Bill Katz: we have a different API, supported by a different data type, for three-dimensional annotations. So for example, if we need to do synapses, 424 01:08:03.480 --> 01:08:11.520 Bill Katz: those are a series of 3D points that are sparse. They may have characteristics or properties associated with them, like they're associated with a label, or 425 01:08:11.760 --> 01:08:23.910 Bill Katz: that they're considered pre-synaptic, post-synaptic, etc. But that's a very different type of data that may be extremely large, like, you could have 100 million of these data points. It's more like a point cloud 426 01:08:24.240 --> 01:08:42.090 Bill Katz: kind of scenario. And then we have a different data type and a different API for our n-dimensional dense stuff. We have a different API if we wanted a sparse volume representation, so for example, if you needed, like, a neuron that stretched over large periods of, you know, space. 427 01:08:43.110 --> 01:08:51.540 Bill Katz: And each of these also, in terms of, like, how you'd want to store it, might be stored differently, because, you know, there are different approaches. 428 01:08:56.460 --> 01:09:05.190 Davis Bennett: What I want to make might be a historical point. But I think what's unique about the types of data that we work with, thinking about microscopy data, 429 01:09:05.850 --> 01:09:14.220 Davis Bennett: is that I don't think private industry has needed to solve the problem of sharing and interrogating enormous microscopy data sets. 430 01:09:14.940 --> 01:09:25.260 Davis Bennett: So unlike a situation like a database for records, a database for transactions, I think these things exist in industry; there are good technical solutions there.
431 01:09:25.890 --> 01:09:35.250 Davis Bennett: But I feel like for the really large images that come off the microscopes we use, there's no precedent, I think. So we will have to come up with something 432 01:09:36.270 --> 01:09:42.960 Davis Bennett: on our own. Other things, like the point clouds, are maybe best solved with tools that already exist, that have been solved in other domains. 433 01:09:47.430 --> 01:09:58.470 Mark Kittisopikul: Oh, let's see, I was going to maybe try to clarify something about APIs, which is, I think many of the APIs we work with are also bound to a specific programming language. 434 01:09:59.040 --> 01:10:11.160 Mark Kittisopikul: And something Davis and I have talked about in the past is maybe moving towards more of a service model, maybe something like a web service even, where you're more speaking a protocol. 435 01:10:12.510 --> 01:10:14.310 Mark Kittisopikul: Or you can think of something like OMERO. 436 01:10:15.330 --> 01:10:21.090 Mark Kittisopikul: So you have a server that's capable of actually parsing whatever format you want and providing you the 437 01:10:22.590 --> 01:10:23.370 Mark Kittisopikul: pixel data. 438 01:10:25.350 --> 01:10:33.840 Mark Kittisopikul: And so that's another thing. The other thing I've noticed in talking to some manufacturers is there's some willingness to support, 439 01:10:35.310 --> 01:10:40.770 Mark Kittisopikul: support the users, if they could be provided some kind of interface. 440 01:10:42.000 --> 01:10:51.360 Mark Kittisopikul: So for example, we were talking to a camera manufacturer about maybe using a kind of block-based format. And they said, "Well, you know, just send us a DLL 441 01:10:52.410 --> 01:10:55.710 Mark Kittisopikul: and, you know, we'd be willing to plug into that."
442 01:10:57.270 --> 01:11:00.360 Mark Kittisopikul: And so I think someone has to maybe think of 443 01:11:02.100 --> 01:11:09.300 Mark Kittisopikul: some kind of modular ecosystem where, you know, we can define, 444 01:11:10.320 --> 01:11:18.690 Mark Kittisopikul: maybe abstractly, the interface, and there might be some willingness to build modules that connect to those interfaces. 445 01:11:21.240 --> 01:11:36.390 Mark Kittisopikul: And I guess one other point about APIs. So I think that, you know, there are some things that we need to be aware of in terms of the legal aspects of using APIs; this is currently working its way through the US Supreme Court. 446 01:11:37.710 --> 01:11:50.910 Mark Kittisopikul: Specifically with the Java, Google versus Oracle matter, that we need to be cautious about, making sure that if we do provide these interfaces, they are free to use. 447 01:11:54.060 --> 01:11:54.420 Josh Moore: Damir. 448 01:11:59.880 --> 01:12:02.700 Damir Sudar: Sorry, I forgot to lower my hand. Sorry. Okay. 449 01:12:09.990 --> 01:12:11.130 Josh Moore: So wherever we ended up, 450 01:12:16.710 --> 01:12:26.490 Josh Moore: sounds like we certainly need, so from my side, I mean, it probably means an investigation and maybe a learning experience of the APIs that exist. So if anyone wants to, you know, 451 01:12:27.720 --> 01:12:40.620 Josh Moore: volunteer to either teach me directly, or maybe set up a little lightning talk and walk us through it, that would actually be really valuable and would help me understand: are we actually seeing the same thing, and we're just seeing it in different ways, or 452 01:12:41.820 --> 01:12:55.200 Josh Moore: are we in two completely different worlds, or, I think the most likely, and I think the most optimal, you know, is it a subset and a superset, so that we can actually work together quite nicely.
That'd be my hope. Um, 453 01:12:58.920 --> 01:12:59.760 Josh Moore: so that's fair. 454 01:13:02.640 --> 01:13:10.560 Josh Moore: We still have a couple more topics, and we have about 45 minutes, if anyone wants to move us to something completely different. So we try to make 455 01:13:11.550 --> 01:13:25.980 Josh Moore: kind of wake up. I actually was warned that in the previous session we should have built in a break about halfway. So the next time we have a call, I'll build it into the schedule, so at least a few minutes to get away and get something to drink. 456 01:13:29.910 --> 01:13:41.910 Josh Moore: I mean, next on the list, with the most people who were interested, was actually just talking about the pyramids and the multiscale representation. Is that still valid? Does anyone have anything particular they want to know or want to share about those? 457 01:13:44.340 --> 01:13:49.050 Davis Bennett: I just want to say I'm excited about the OME specification. 458 01:13:50.340 --> 01:14:04.770 Davis Bennett: For one, I am thinking about asking the Neuroglancer developer to support that, because I don't think he has implemented any pyramid specification for Zarr containers. Maybe Eric Perlman can speak to that. 459 01:14:07.110 --> 01:14:17.010 Eric Perlman: I can speak to that. Jeremy has support for Saalfeld's original sort of hacky N5 scale downsamples. 460 01:14:17.250 --> 01:14:27.360 Eric Perlman: Like, whatever it was that Saalfeld did initially with the FIB-SEM data, which John is very well aware of, he supports that. And so one of the things I've talked to Josh Moore about is it would be straightforward to 461 01:14:28.050 --> 01:14:39.480 Eric Perlman: add a dialect of that for the Zarr data source. I mean, basically, we'd just copy and paste the N5 one as a sample. The bigger issue with Neuroglancer is actually that he currently requires 462 01:14:41.040 --> 01:14:52.680 Eric Perlman: all the channel dimensions to be in the same chunk.
This is not an issue if you're doing EM data, but for RGB data, currently, at the moment in the OME world, like with bioformats2raw, 463 01:14:53.430 --> 01:14:57.210 Eric Perlman: R, G, B and alpha are going to be four distinct chunks. 464 01:14:57.480 --> 01:15:07.410 Eric Perlman: And so when I have done this, I have to write to disk and then reformat it to the right size. But Jeremy's, I mean, it's one of those things where, unfortunately, Jeremy is the only one in the universe who could 465 01:15:07.740 --> 01:15:13.140 Eric Perlman: fix Neuroglancer to support that. So anyway, that's probably something to ask him. 466 01:15:16.350 --> 01:15:26.160 John Bogovic: I'd also just like to say that I'm excited, and I hope we agree on what we do about pyramids in the future, but that's essentially all. I'm okay if we move ahead. 467 01:15:27.150 --> 01:15:40.920 Eric Perlman: Yeah, I mean, with the pyramids there have basically been two dialects. There was the model of, for each resolution you say what the new resolution is, versus the other dialect, which is, for each resolution you say how much it was downsampled. 468 01:15:42.030 --> 01:15:45.900 Eric Perlman: I mean, they're functionally the same, but those have kind of been the two camps. 469 01:15:45.960 --> 01:16:00.960 Josh Moore: You just kind of have to make a decision at some point. So for anyone who wasn't part of it, there was actually a quite protracted conversation on GitHub. I mean, if you follow the links from image.sc, you can get back there. So all of that's archived, right, for posterity. Um, 470 01:16:02.970 --> 01:16:09.840 Josh Moore: so from my side, yeah, it was good to just get everyone to agree, make a decision, say this is what we're going to do going forward. 471 01:16:10.410 --> 01:16:22.800 Josh Moore: Um, the more pieces of software that we can get reading and writing the same format, the better. So if anyone's blocked in any way on making that happen,
Speak up. And let's try to tackle that. 472 01:16:23.100 --> 01:16:30.780 Josh Moore: Sounds like we need time from Jeremy. So, okay, I don't know what has to happen for that to occur. Maybe it's funding. I don't know. 473 01:16:31.500 --> 01:16:37.110 Josh Moore: Um, from my side, I do know that, and someone was talking about that this morning, 474 01:16:37.590 --> 01:16:54.180 Josh Moore: you know, we punted on the scale factor, the scale origin and offset metadata from the multiscale specification. I assume that's blocking some people, and so, sorry John. Um, so whoever would like to, you know, 475 01:16:55.830 --> 01:17:02.490 Josh Moore: I have a couple more things on my plate. But if someone wants to go, "Okay, you know, I have an opinion, this is how we should do it," basically 476 01:17:02.850 --> 01:17:10.380 Josh Moore: bringing back that conversation from the Zarr spec repository, that would be a great way to make it happen. So basically we need 477 01:17:11.250 --> 01:17:19.020 Josh Moore: to run the conversation a little bit, get the consensus. Again, I think the implementation is easy. You know, it's not really a problem once you've made the decision. 478 01:17:19.560 --> 01:17:28.950 Josh Moore: And then we go from there. So maybe we can get that done during 2020. That might be a nice way to kind of top off the year. Um, 479 01:17:29.820 --> 01:17:30.510 Josh Moore: Nico, you're 480 01:17:30.660 --> 01:17:34.230 Josh Moore: raising your, oh, actually several hands. Nico, then Trevor, then Jamie, sorry. 481 01:17:36.120 --> 01:17:41.820 Nicolas Chiaruttini: Maybe I missed it. You're talking about the discussion detailing the specification that happened somewhere on GitHub? 482 01:17:41.910 --> 01:17:42.420 Right. 483 01:17:44.160 --> 01:17:49.950 Nicolas Chiaruttini: Okay, so then, yeah, just put a link, because then I would be interested to see. 484 01:17:51.090 --> 01:18:00.510 Nicolas Chiaruttini: Regarding multi-resolution.
So I don't know, I just use the Bio-Formats API, which is working fine. And I did a bridge with BigDataViewer. 485 01:18:00.960 --> 01:18:11.820 Nicolas Chiaruttini: So, which is like an affine transform for each different scale. What I was wondering, if you discussed this matter, is: when you do a reduction of several pixels towards one block, 486 01:18:12.960 --> 01:18:15.840 Nicolas Chiaruttini: usually you compute the average of the pixels. 487 01:18:17.070 --> 01:18:26.280 Nicolas Chiaruttini: So then you get the average of eight pixels or whatever. But then I was wondering whether there was a possibility, or whether it needs to be specified, 488 01:18:26.670 --> 01:18:37.770 Nicolas Chiaruttini: whether you want to do the reduction with the maximum value, for instance, or with a minimum value, or with something different. So meaning specifying the process by which you're reducing 489 01:18:38.220 --> 01:18:44.430 Nicolas Chiaruttini: the scale of your object, because I found that if you perform only an average, 490 01:18:44.880 --> 01:19:02.070 Nicolas Chiaruttini: then your structure, when you downscale a lot, basically reaches zero, but if you do a reduction with the maximum value, then you end up with a sort of maximum projection in 3D. So I don't know if I'm clear. 491 01:19:05.400 --> 01:19:07.200 Josh Moore: I think you're beyond me, but I think 492 01:19:08.310 --> 01:19:10.230 Josh Moore: we should try this, you know. Is 493 01:19:10.440 --> 01:19:13.470 Josh Moore: there something that you want to see in the spec, like there's a way, you know, 494 01:19:13.650 --> 01:19:16.650 Josh Moore: depending on what algorithm's being used to do the downsampling? 495 01:19:16.860 --> 01:19:18.780 Nicolas Chiaruttini: If you want to store what you did. 496 01:19:19.230 --> 01:19:24.900 Josh Moore: We just needed.
We need an example, basically, of what you would like to capture, and then let's try to work it into the spec, so 497 01:19:26.070 --> 01:19:34.350 Eric Perlman: That would be a great thing in the spec. I mean, I can think of average, max, and then also there's the majority downsampling for label data masks. 498 01:19:35.580 --> 01:19:46.050 Eric Perlman: So yeah, I think basically that should be added to that conversation on GitHub; that's something that should be in the spec. It's probably, again, one of those things which most people will not use. Most people will be doing 499 01:19:46.350 --> 01:19:49.440 Eric Perlman: average downsampling, but it's good to have that metadata stored. 500 01:19:50.400 --> 01:20:03.600 Nicolas Chiaruttini: For 3D display, taking the maximum pixel out of a chunk can be useful for a 3D representation when downscaling it, but I can, yeah. 501 01:20:05.070 --> 01:20:11.670 Eric Perlman: I completely agree. My world lately has been a series of two-dimensional downsampling, so it's not as important, but I totally agree. 502 01:20:12.390 --> 01:20:14.010 Josh Moore: Okay, I think we're on Trevor, and then Jamie. 503 01:20:16.290 --> 01:20:22.740 Trevor Manz (he/him/his): All right, and bring it the brilliant, but not quite on multiscale anymore, but I have a little bit of, 504 01:20:24.300 --> 01:20:31.440 Trevor Manz (he/him/his): there was a little bit of discussion about, you know, potentially storage types, thinking more in terms of data APIs rather than formats. 505 01:20:32.040 --> 01:20:44.610 Trevor Manz (he/him/his): And I have done a little bit of experimenting with that, of using Zarr as a data API layer, rather than, and exposing, like, basically using a small microservice 506 01:20:45.090 --> 01:21:03.720 Trevor Manz (he/him/his): to translate chunks from other formats to Zarr so that our web applications can do it.
And I guess my comment is just that there are possibilities to have sort of a data API that mirrors some native format, but also, you know, have all these different types of binary containers as well. 507 01:21:04.950 --> 01:21:11.520 Trevor Manz (he/him/his): But I have done a little bit of experimenting with that. I just wanted to comment, if anyone wanted to follow up or had any questions. 508 01:21:13.050 --> 01:21:15.270 Josh Moore: Yeah, I have great hope in what Trevor's been doing, so 509 01:21:17.250 --> 01:21:20.550 Josh Moore: very cool to make everything just turn into Zarr. Go for it, Jamie. 510 01:21:21.810 --> 01:21:25.920 Jamie Sherman: Okay, just two questions, or one question, one comment. 511 01:21:26.940 --> 01:21:30.000 Jamie Sherman: So one is with the pyramid, with the different pyramid data. 512 01:21:31.590 --> 01:21:49.740 Jamie Sherman: Is the general practice, when you're pulling from a raw, from an instrument manufacturer format, to just mirror over their other pyramids? Like, I work on one library that does this, and at the moment I ignore anything that's not pyramid zero. And so, like, 513 01:21:50.760 --> 01:21:55.950 Jamie Sherman: when you're mapping to a common format, is the general practice to just say, okay, I'm going to take 514 01:21:56.400 --> 01:22:04.830 Jamie Sherman: pyramid zero and document how I transform it to a different pyramid level, or is it to take the manufacturer's image format 515 01:22:05.670 --> 01:22:16.170 Jamie Sherman: with whatever other pyramid levels they've encoded, and pull them across and put them in the data? Because I'm a little concerned, in that case, that you then end up with this, like, 516 01:22:16.830 --> 01:22:23.820 Jamie Sherman: yeah, I got this out of something that uses a proprietary, like, where the information on how it was generated isn't necessarily shared. 517 01:22:25.950 --> 01:22:29.310 Jamie Sherman: And so that's one kind of common question.
518 01:22:30.570 --> 01:22:37.320 Jamie Sherman: And then the other one is, and this is just because it's come up in this documenting 519 01:22:37.980 --> 01:22:56.940 Jamie Sherman: downsampling and that kind of thing, I know that there are people at the institute I work at that are working on essentially ML transforms, where you take lower resolution data and you try and up-transform the resolution using an ML algorithm or something like that. 520 01:22:58.050 --> 01:23:15.720 Jamie Sherman: There are pros and cons, etc., of this kind of methodology, but being able to, like, link to the method, or encode that this isn't actually the raw data or that it's a transformed form of data, that would be helpful later. It's something to be aware of and think of, at least. 521 01:23:19.530 --> 01:23:21.510 Josh Moore: Melissa, you want to take the bioformats2raw 522 01:23:21.540 --> 01:23:22.290 Jamie Sherman: part of this? 523 01:23:23.220 --> 01:23:33.000 Melissa Linkert: Sure. So, um, to your question about whether we, you know, transfer over any proprietary pyramid that was in the original file: we do not at the moment. 524 01:23:33.450 --> 01:23:44.160 Melissa Linkert: We only take the largest resolution, and then it's downsampled from there, for all of the reasons that you mentioned. You know, we can't really rely on 525 01:23:44.970 --> 01:23:58.560 Melissa Linkert: how that pyramid was generated. It may not be consistently downsampled across resolutions, so you may have different sizes between different pyramid levels. It may not be downsampled, 526 01:24:00.210 --> 01:24:11.670 Melissa Linkert: you know, at the scale that you would like to see in your final output file. There are all sorts of reasons why it's unreliable to do that. So we just throw everything except the largest resolution away. 527 01:24:13.380 --> 01:24:20.550 Melissa Linkert: We do, in bioformats2raw, document or include in the metadata.
The downsampling algorithm that was chosen. 528 01:24:21.000 --> 01:24:36.360 Melissa Linkert: There are a few different options. For the most part, we're using OpenCV to kind of do the heavy lifting for all of that, but whatever algorithm you choose, that's going to be recorded in the metadata that winds up in the Zarr or N5 container there. 529 01:24:38.280 --> 01:24:38.610 Melissa Linkert: Excellent. 530 01:24:38.700 --> 01:24:39.090 Melissa Linkert: Thank you. 531 01:24:40.650 --> 01:24:42.120 Josh Moore: And then, in general, I think, you know, 532 01:24:43.140 --> 01:24:48.240 Josh Moore: several of you have touched on all kinds of things that could go into the metadata eventually. Um, 533 01:24:49.830 --> 01:25:00.180 Josh Moore: we're currently trying to keep things as simple as possible. You know, so we're not going to try to specify upfront, you know, how to store the ML model that you're using to do the downsampling or the upsampling. 534 01:25:00.870 --> 01:25:10.170 Josh Moore: Um, but I think those are interesting conversations. So from our side, we will try to get an infrastructure in place that allows the metadata to be written down 535 01:25:11.460 --> 01:25:24.660 Josh Moore: in these formats and accessed via these APIs, and then all hell's going to break loose, right, but it's going to be fun. And that's what we need to do. So we'll get there. Davis, I think you raised your hand. 536 01:25:27.960 --> 01:25:28.590 Josh Moore: Oh, you're muted. 537 01:25:29.460 --> 01:25:40.320 Davis Bennett: So I'd like to offer a contrarian perspective, and maybe thinking pessimistically, about putting the mechanism by which the data was generated in the multi-resolution metadata. 538 01:25:40.890 --> 01:25:47.730 Davis Bennett: Like, it's not clear what the specific use case is for knowing that it was the mode or the mean, and then I worry about 539 01:25:48.600 --> 01:25:58.470 Davis Bennett: the possibility of arbitrary.
I mean, you can imagine an arbitrary function that takes eight pixels and generates one pixel out of that. And so there will need 540 01:25:59.100 --> 01:26:07.170 Davis Bennett: Touches for this metadata field if somebody can't generate a string that someone else understands for what they used to generate the downsampling. 541 01:26:07.620 --> 01:26:17.130 Davis Bennett: So philosophically, my perspective has been that a multi-resolution pyramid literally just is a collection of images that should be able to stand on their own. 542 01:26:17.730 --> 01:26:24.330 Davis Bennett: And it's some application somewhere that thinks this thing forms a pyramid. But for me, the guiding principle was, in the metadata, 543 01:26:25.050 --> 01:26:36.630 Davis Bennett: metadata that describes that image, where it is in space, and nothing else, or arbitrary metadata that could apply to any image. So that's just me being contrarian, though. Yeah. 544 01:26:36.930 --> 01:26:47.760 Josh Moore: I think you're expressing what I'm trying to get at as well, which is I don't want to be the one responsible for it. I don't have a problem putting in a place where people can fill this out, you know, put something in there. But will the community know how to interpret it? 545 01:26:49.260 --> 01:26:50.340 Josh Moore: I guess they have to decide. 546 01:26:53.820 --> 01:26:57.930 Josh Moore: Okay, any other general themes or feelings on the multiscale representation? 547 01:27:00.030 --> 01:27:07.230 Josh Moore: Sounds like we've somewhat segued into metadata. There are a couple of people who have their names on it. Any, thanks John, 548 01:27:08.430 --> 01:27:10.740 Josh Moore: any immediate thoughts? Nico, you are first. 549 01:27:12.780 --> 01:27:14.850 Josh Moore: The spline worked 550 01:27:15.000 --> 01:27:19.950 Nicolas Chiaruttini: On. So, yes, just because we have these very big, huge images.
551 01:27:21.240 --> 01:27:30.960 Nicolas Chiaruttini: We don't want to resave them when they are transformed, and my application is like brain slice registration. And so I want to basically have a multi-resolution image. 552 01:27:32.160 --> 01:27:51.780 Nicolas Chiaruttini: I'm doing affine transforms and spline transforms using, you know, John's BigWarp thing. And I just want to store the transformation. And so it can be like a few landmark points, then it's a very concise representation of a transform, and I don't know if it has to be supported or not. 553 01:27:53.250 --> 01:27:57.840 Nicolas Chiaruttini: I think affine transform is very important to get the localization in 3D. 554 01:27:59.040 --> 01:28:07.080 Nicolas Chiaruttini: That would come and, I think that should come with a time at which it was taken also, time and space. That makes sense. 555 01:28:09.300 --> 01:28:19.290 Nicolas Chiaruttini: But then on top of affine transforms, I was wondering whether we want to support sort of stoned out spline deformation metadata. 556 01:28:25.500 --> 01:28:27.750 Josh Moore: Is your hand up, Davis, or is that up again? 557 01:28:27.870 --> 01:28:28.260 Damir Sudar: Go for 558 01:28:28.920 --> 01:28:35.520 Davis Bennett: Yeah, I would once again, the space of possible image creation is so huge. 559 01:28:37.170 --> 01:28:43.380 Davis Bennett: Very, that if, suppose you do support a metadata field for this specific type of image registration, 560 01:28:45.060 --> 01:28:58.830 Davis Bennett: Consider what fraction of images would use that field. And then how many other registration modalities there are, like, do you store a warp field if you've done like a full, you know, a full nonlinear registration? 561 01:29:00.120 --> 01:29:09.660 Davis Bennett: I feel like image registration is like, it's like the medieval map where there's the ocean dragons, like the prospect of supporting anything more than linear methods
562 01:29:10.080 --> 01:29:18.600 Davis Bennett: That apply to the whole image makes me nervous. But maybe if there's something that 99% of images would use and the pain of not having it in the metadata would be acute, 563 01:29:19.680 --> 01:29:22.680 Davis Bennett: I could be persuaded in the other direction. That's just my feeling. 564 01:29:26.130 --> 01:29:26.490 Josh Moore: John 565 01:29:29.310 --> 01:29:31.380 John Bogovic: unmute. Okay. Yeah, so I think 566 01:29:32.730 --> 01:29:40.830 John Bogovic: I appreciate that we shouldn't define exhaustively all of the metadata that will ever be used, like 567 01:29:42.150 --> 01:29:53.610 John Bogovic: Josh or me or no one should be responsible for everything. But whatever format, whatever we decide, should be flexible enough that one can include their own metadata 568 01:29:54.060 --> 01:30:07.830 John Bogovic: For their own specific needs, because if we don't, people like Nico or me or somebody else are going to need or want something and they're going to leave for another format or write a new format with this metadata 569 01:30:08.610 --> 01:30:14.280 John Bogovic: For their particular purpose. So not every tool has to understand how to use 570 01:30:15.120 --> 01:30:23.010 John Bogovic: The spline that Nico writes into there, but they will be able to at least get the data out. Whatever data are there, they will be able to get out. 571 01:30:23.760 --> 01:30:36.090 John Bogovic: That is the data that are common. So I'm just, I'd like to strongly advocate for being as flexible as possible for custom fields with custom content that not every tool
572 01:30:36.690 --> 01:30:49.710 John Bogovic: Has to be able to read or write or understand. Whatever Nico or me or whoever writes into that should be documented by the user or creator of the tools that consume that. 573 01:30:51.240 --> 01:31:02.820 John Bogovic: But, and the fact is that if it's written in a common way it will make it easier for other tools to potentially get it out, like it's much easier if, say, 574 01:31:03.540 --> 01:31:13.380 John Bogovic: Someone in Python wants to use something that Nico writes, it will be easier for them to get at the data if they use the metadata of 575 01:31:15.060 --> 01:31:24.000 John Bogovic: The format, the way to store and read metadata, than it would be if Nico wrote his own format. And that's one of the main reasons I think we should be flexible. 576 01:31:25.140 --> 01:31:32.100 Ulrike Boehm: So, so John, I completely agree with, like, having the flexibility. But one thing we should all also 577 01:31:32.730 --> 01:31:41.580 Ulrike Boehm: All agree on is like really having a proper documentation, that people really, in case they need something new, they want to work with something else, that everything's really 578 01:31:42.450 --> 01:31:55.080 Ulrike Boehm: Listed in such a way that you can find your way around. And that I think is, I mean, independent of whether there's a diversity or so or as like diversity, it should be so that people 579 01:31:55.650 --> 01:32:05.490 Ulrike Boehm: I mean that people don't become desperate or so, and it should also be clear what's written in a documentation, because sometimes people define that whatever, right, like 580 01:32:06.210 --> 01:32:11.850 Ulrike Boehm: But it should be so that people can really also make use of that document, but just as my two cents. 581 01:32:13.860 --> 01:32:15.390 Bill Katz: Here.
It kind of reminds me 582 01:32:16.230 --> 01:32:20.280 Damir Sudar: I think I'm next, sorry, so 583 01:32:21.330 --> 01:32:22.200 Damir Sudar: This is again a 584 01:32:24.810 --> 01:32:30.330 Damir Sudar: Push for the parallel metadata storage discussion that 585 01:32:30.990 --> 01:32:42.690 Damir Sudar: Caterina's leading there. What we're trying to do is to have a basic set of metadata fields predefined and actually manage those, and have them really 586 01:32:43.320 --> 01:32:55.800 Damir Sudar: solidly defined and documented as they can be, but then there is the possibility of extensions. Right. So the idea, what Nico needs, maybe it's just an affine transform stored. 587 01:32:56.310 --> 01:33:09.600 Damir Sudar: A lot of people need just an affine transform; that can be part of the base metadata definition. But then if you indeed need some weirdo spline transformation that isn't captured 588 01:33:10.350 --> 01:33:20.700 Damir Sudar: Or isn't easily captured in a standardized way, the extension mechanism within the metadata environment allows you to store your 589 01:33:21.270 --> 01:33:36.480 Damir Sudar: weirdo registration thing. So, so the idea is to have both, but to never get to the point where people say, man, it won't help me, and so I'm writing something completely different. So I'm completely on board with what John just said. 590 01:33:37.560 --> 01:33:38.580 Damir Sudar: So that's it. Thanks. 591 01:33:41.100 --> 01:33:47.280 Jamie Sherman: Jay, okay, so like one, like Davis's comment really made me think, and I 592 01:33:48.390 --> 01:33:58.950 Jamie Sherman: Think he's got a really good point. And one of the things that might be key to it is like having, essentially, the equivalent of a binary flag that says this is actual raw data. 593 01:33:59.550 --> 01:34:04.230 Jamie Sherman: And everything else is an interpolation, interpreted data level.
594 01:34:04.860 --> 01:34:15.660 Jamie Sherman: Would be incredibly useful because then if you're a naive user, you can just say, all right, I don't have time to evaluate what these other layers are or to derive how they got there. 595 01:34:16.200 --> 01:34:23.850 Jamie Sherman: But I just want to pull all this data into some model or something like that. And I want to know that it's real data without artifacts, then 596 01:34:24.390 --> 01:34:36.900 Jamie Sherman: You can find that easily in a trivial way, and then obviously having the ability to add detail is great, but one of the things that does come to mind is like 597 01:34:37.440 --> 01:34:46.320 Jamie Sherman: How much of this do you want to be, like, how much of an open format is cataloging what's been done to acquire the data, so it can be shared, 598 01:34:46.800 --> 01:35:00.150 Jamie Sherman: And how much of it is a lab notebook that's trying to capture everything that ever happened to the file, which kind of becomes a rabbit hole that can go on forever. That's just definitely a concern, anyway. 599 01:35:01.230 --> 01:35:07.770 Josh Moore: Just from my side, I would say, we would just put, you know, we'll just give the user paper and then from that point, they have to figure out what they're writing down, so 600 01:35:09.600 --> 01:35:10.590 Josh Moore: It's back to Davis. 601 01:35:13.830 --> 01:35:16.170 Davis Bennett: Oh, I, I took my hand down, but I can 602 01:35:18.270 --> 01:35:19.410 Josh Moore: Bill is waiting as well. 603 01:35:19.650 --> 01:35:27.900 Davis Bennett: Sorry, it's just that I was just wondering, is there a way, if you're like a, to have metadata that's like for these extensions, is there a standard way for 604 01:35:28.620 --> 01:35:38.040 Davis Bennett: A specific tool like, say, Neuroglancer or BigWarp or whatever to have its metadata registered somehow? What's the state of the art for that?
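Editor's sketch of the points above: Melissa's remark that the chosen downsampling algorithm gets recorded in the container metadata, Jamie's raw-data flag, and the free-form extension section John and Damir argue for could look roughly like the following. This is only an illustration under assumptions. The key names ("levels", "raw", "downsampling", "extensions") and the "example-tool" namespace are hypothetical and are not taken from any published Zarr, N5, or OME specification.

```python
import json

# Hypothetical layout: none of these key names come from a published spec.
multiscale = {
    "levels": [
        {"path": "0", "raw": True},
        {"path": "1", "raw": False, "downsampling": {"method": "mean", "factor": 2}},
        {"path": "2", "raw": False, "downsampling": {"method": "mode", "factor": 4}},
    ],
    # Free-form, tool-specific section that other readers are free to ignore.
    "extensions": {
        "example-tool": {"note": "anything the tool wants to store"},
    },
}


def raw_levels(metadata):
    """Return the paths of levels flagged as raw (not derived) data."""
    return [lvl["path"] for lvl in metadata["levels"] if lvl.get("raw", False)]


def describe(metadata):
    """Summarise each level without needing to understand the method string."""
    lines = []
    for lvl in metadata["levels"]:
        if lvl.get("raw", False):
            lines.append(f"{lvl['path']}: raw")
        else:
            method = lvl.get("downsampling", {}).get("method", "unknown")
            lines.append(f"{lvl['path']}: derived ({method})")
    return lines


if __name__ == "__main__":
    print(raw_levels(multiscale))
    print("\n".join(describe(multiscale)))
    # The whole structure is plain JSON, so it round-trips unchanged.
    assert json.loads(json.dumps(multiscale)) == multiscale
```

A naive reader only needs `raw_levels` to find the data it can trust. The method string is carried along but never interpreted, which is where Davis's concern about arbitrary downsampling functions would still apply.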
605 01:35:39.210 --> 01:35:46.470 Josh Moore: So I have an opinion here. I mean, and this is basically part of what will come up in Caterina's call on the 20th, um, 606 01:35:48.990 --> 01:35:51.090 Josh Moore: How far do I want to go in this, so 607 01:35:52.470 --> 01:36:00.360 Josh Moore: The experience with OME-XML has been that XML XSD doesn't, or sorry, yeah, XSD 608 01:36:00.990 --> 01:36:08.730 Josh Moore: doesn't provide what we need for the ability to extend, so basically Caterina went through this huge process, wrote a full extension of OME-XML. 609 01:36:09.120 --> 01:36:15.990 Josh Moore: And the cost to everyone would have been so high, to actually support it. Right. I mean, it's basically this huge breaking change. 610 01:36:16.770 --> 01:36:30.810 Josh Moore: So my feeling is that JSON-LD is the path forward and I have some work to do to make that proposal and show it to everyone. So that's on me and hopefully the person we hired to do this, um, 611 01:36:32.850 --> 01:36:37.830 Josh Moore: That's, so the major competitor to that would be JSON Schema. I'm pretty 612 01:36:39.030 --> 01:36:46.560 Josh Moore: skeptical of JSON Schema at this point, after having spent over a year on the Human Cell Atlas project trying to use JSON Schema for this. 613 01:36:46.800 --> 01:36:54.810 Josh Moore: Basically you have all the downsides of XML Schema and none of the support that was built up by the W3C over, you know, decades. 614 01:36:55.800 --> 01:37:11.100 Josh Moore: The RDF or JSON-LD community is W3C standardized, does have a lot of support. I don't think the support's great. The other major downside is it's possible to write things that are very difficult 615 01:37:12.030 --> 01:37:19.260 Josh Moore: To understand, so I think, so, my ideal would be if we could find a way to have some subset of JSON-LD.
616 01:37:19.620 --> 01:37:28.860 Josh Moore: I mean, we already want to put JSON into Zarr, right, so that's kind of pretty straightforward, and then say this is the subset of JSON-LD that we're going to support. 617 01:37:29.580 --> 01:37:36.270 Josh Moore: One of the key values of it is its extensibility. It's an open world model, you can always talk about anything. So basically, 618 01:37:36.690 --> 01:37:47.280 Josh Moore: From the point of view of OME currently, so, as you know, we are the gatekeepers. We keep people out of the specification 619 01:37:47.760 --> 01:37:51.720 Josh Moore: By force, you know, we just, we can't let anyone write whatever they want. 620 01:37:52.560 --> 01:37:59.790 Josh Moore: The open world model is exactly the opposite. Anyone can say anything they want. So I think that gets us exactly the extensibility that we need. 621 01:38:00.540 --> 01:38:15.150 Josh Moore: Um, will it end in ultimate chaos? I hope not. That's certainly my worry. Bill, on this, let Caterina speak since I talked about her and then get back to you. And then we slowly need to start wrapping up. 622 01:38:17.610 --> 01:38:30.480 Caterina Strambio: Yeah. So I mean, yeah, so there is going to be a follow up on the discussion on the 20th. Well, I think it's going to be on the 20th of November, but we still have to finalize. Anyway, you can look it up on image.sc. 623 01:38:31.890 --> 01:38:49.440 Caterina Strambio: But, um, yeah. So, I mean, I think that metadata, I didn't jump in because the metadata problem is like, there are multiple things that are defined as metadata, and the one that we have worked on, initially for the 4D Nucleome initiative and then BINA, and now in support of QUAREP, 624 01:38:50.580 --> 01:39:00.360 Caterina Strambio: With Damir and Ulrike and many other, and several other people, is the microscopy.
So hardware, settings, quality control part. 625 01:39:01.980 --> 01:39:08.610 Caterina Strambio: And the reason why, I mean, there was a need for the community to 626 01:39:09.870 --> 01:39:20.820 Caterina Strambio: chime in on what is currently being represented in the 2016 version. And we used the XSD because that was, at that time, 627 01:39:21.870 --> 01:39:33.420 Caterina Strambio: how the model was represented, but obviously all along we've been very interested in moving past that. So, 628 01:39:35.610 --> 01:39:43.650 Caterina Strambio: First, I mean, it is also one of the topics that we will discuss, is, you know, is it necessary to have multiple representations, and how do we make sure that they keep 629 01:39:45.870 --> 01:39:48.540 Caterina Strambio: They keep, you know, aligned with each other. 630 01:39:50.730 --> 01:39:57.660 Caterina Strambio: Anyway, so, um, I don't want to take too much time in this space. I mean, we can talk about it more 631 01:39:58.770 --> 01:39:59.130 Later. 632 01:40:00.660 --> 01:40:01.890 Caterina Strambio: But anyway, um, 633 01:40:04.020 --> 01:40:15.930 Caterina Strambio: I think that the representation of the metadata is a little bit of a convergence between a couple of different worlds. One is, what do we want to represent. And so, for example, the community, the microscopy community, 634 01:40:16.290 --> 01:40:26.520 Caterina Strambio: Has to chime in on what needs to be represented for defining the hardware and the settings and the quality control, versus how do we represent it. And so, you know, 635 01:40:27.480 --> 01:40:44.850 Caterina Strambio: Having different representations has been necessary for our work because not everybody can look into an XSD file or even a JSON-LD file. So we need a way for people to interact with the content and 636 01:40:46.290 --> 01:40:46.500 Yeah. 637 01:40:48.030 --> 01:40:48.300 Josh Moore: Okay.
638 01:40:48.330 --> 01:40:52.170 Caterina Strambio: Oh, absolutely, that has to be human readable. 639 01:40:52.860 --> 01:40:58.500 Josh Moore: So I'll add the link to Caterina's image.sc post. And then Bill, you wanted to say. 640 01:40:59.760 --> 01:41:11.880 Bill Katz: I just had a question, mostly because it was prompted by what Nicolas was saying where, you know, there was this basically custom transformation, this, you know, and 641 01:41:12.480 --> 01:41:24.990 Bill Katz: Sort of tying it into compression as well, that if you have an extension or some sort of customized way of, you know, that the data actually has to be run through something else. 642 01:41:25.980 --> 01:41:33.630 Bill Katz: I assumed that, because I'm not that familiar with all the different types of metadata, is there something similar to like what we would do with 643 01:41:34.170 --> 01:41:55.350 Bill Katz: Source code packaging where there's like a URI or something that corresponds to a particular, you know, like a unique identifier for an extension code base that would have to be sort of, you know, a plugin that has to be accessible so that someone could actually get the data. 644 01:41:55.890 --> 01:42:07.680 Josh Moore: So at the compression level, yes. So currently it's fairly hard coded in the V2 spec of Zarr. The next version of Zarr that everyone's working on 645 01:42:08.520 --> 01:42:19.410 Josh Moore: Does have a more formal, I think PURL-based extension registry, so you, you know, you get a PURL URL that's the source of your 646 01:42:20.100 --> 01:42:31.380 Josh Moore: Probably shared library or in Python, you know, just code that you can load and that does the filtering, or the compression, or, you know, whatever process it is you want to apply to every chunk. 647 01:42:32.370 --> 01:42:44.250 Josh Moore: At the metadata level, at least in the JSON-LD world, you have the concept of a context.
So you should be able to go to a URL online and load a definition of the metadata. 648 01:42:45.480 --> 01:42:53.160 Josh Moore: Whether or not you can interpret that representation basically depends on how smart your client is, you know, how much do you understand the vocabulary. 649 01:42:53.790 --> 01:42:59.340 Josh Moore: Um, so, you know, you could imagine, if we stick to just the transforms, 650 01:42:59.970 --> 01:43:11.040 Josh Moore: You know, if the, I think, Damir, you talked about a base transform class and an affine transform subclass. And if there's a spline subclass that lives in a different context, 651 01:43:11.790 --> 01:43:19.590 Josh Moore: All the fields that are common amongst those you could interpret and maybe do something with them. But you'll need to have the concept of what's required, 652 01:43:20.400 --> 01:43:35.910 Josh Moore: You know, is this useful to me if I don't understand all the metadata. It gets very tricky very quickly, as Jamie's pointing out. So I don't know if we're going to be able to build clients which can work with partial information. Not sure. 653 01:43:37.170 --> 01:43:47.400 Bill Katz: Is there any notion of sort of an extension perhaps being a service, as opposed to a link to a piece of code that gets downloaded and needs to be executed? 654 01:43:48.900 --> 01:43:49.110 Josh Moore: Only 655 01:43:50.790 --> 01:43:55.350 Bill Katz: There are lots of, like, Lambdas and various other things now which are available. 656 01:43:56.670 --> 01:44:07.080 Josh Moore: I don't know of anyone who's doing that, but that doesn't mean it's not possible. My knee-jerk reaction is I have security concerns, that sounds actually quite scary. 657 01:44:07.530 --> 01:44:18.600 Josh Moore: Um, but yeah, I mean, the security issues can be handled with enough complexity, with enough, you know, infrastructure. So that's something to think about. 658 01:44:21.000 --> 01:44:23.160 Josh Moore: Did anyone else have anything to kind of close up?
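Editor's sketch of the "partial information" client Josh describes: a document mixes a base transform vocabulary with a spline vocabulary from a different context, and a client that only knows the base vocabulary decides whether it can still use the data. Everything here is an assumption made for illustration. The example.org URLs, the field names, and the "required" convention are hypothetical, and the `expand` helper is a tiny stand-in for real JSON-LD expansion, which handles many more cases.

```python
# Hypothetical vocabularies: these URLs do not point at real, published contexts.
BASE = "https://example.org/ngff/transform#"
SPLINE = "https://example.org/nico/spline#"

# The fields this particular client understands, as full IRIs.
KNOWN_FIELDS = {BASE + "type", BASE + "affine"}
# Hypothetical field listing what a reader must understand to use the data.
REQUIRED_KEY = BASE + "required"


def expand_term(term, context):
    """Expand 'prefix:name' to a full IRI if the prefix is in the context."""
    prefix, sep, name = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + name
    return term


def expand(doc):
    """Expand all top-level keys of a document using its own @context."""
    context = doc.get("@context", {})
    return {expand_term(k, context): v for k, v in doc.items() if k != "@context"}


def interpret(doc):
    """Split fields into understood and unknown, and report usability."""
    context = doc.get("@context", {})
    fields = expand(doc)
    required = {expand_term(t, context) for t in fields.pop(REQUIRED_KEY, [])}
    understood = {k: v for k, v in fields.items() if k in KNOWN_FIELDS}
    unknown = sorted(k for k in fields if k not in KNOWN_FIELDS)
    # Usable only if every required field is one this client understands.
    usable = required <= set(understood)
    return understood, unknown, usable


example_doc = {
    "@context": {"t": BASE, "s": SPLINE},
    "t:type": "spline",
    "t:affine": [[1, 0, 0], [0, 1, 0]],
    "s:controlPoints": [[0, 0], [10, 12]],
    "t:required": ["t:affine"],
}

if __name__ == "__main__":
    understood, unknown, usable = interpret(example_doc)
    print("understood:", sorted(understood))
    print("unknown:", unknown)
    print("usable:", usable)
```

In this sketch the spline control points are carried along but ignored, and the client still reports the document as usable because only the affine part is marked required. If the producer listed the spline field as required instead, the same client would have to refuse, which is the tricky part raised in the discussion.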
659 01:44:26.370 --> 01:44:27.450 Josh Moore: Okay, I'm gonna write down 660 01:44:29.550 --> 01:44:30.180 Like I said, 661 01:44:31.860 --> 01:44:34.080 Josh Moore: No URL as a service. 662 01:44:38.940 --> 01:44:40.140 Josh Moore: Okay, so 663 01:44:41.640 --> 01:44:43.320 Josh Moore: If we're kind of in finishing up mode. 664 01:44:45.660 --> 01:44:51.420 Josh Moore: There's certainly a sense that we should figure out next steps. I don't have them off 665 01:44:52.320 --> 01:45:01.020 Josh Moore: The tip of my tongue. If anyone has anything that they're clear they either want to do this or they want to meet up with someone and have a conversation, feel free to say it now or write it into the document. 666 01:45:01.770 --> 01:45:10.200 Josh Moore: Otherwise, I will certainly write up a summary of some form and post it to image.sc or to the OME website. Um, and we keep the conversation going. 667 01:45:11.280 --> 01:45:13.440 Josh Moore: So does anyone have anything they want to do right away? 668 01:45:16.110 --> 01:45:16.740 Josh Moore: Thank you. 669 01:45:18.210 --> 01:45:31.530 Ola Tarkowska: So I have a proposal, it depends on how extensive your notes are, but because today there were plenty of problems raised and quite a lot of 670 01:45:32.280 --> 01:45:42.510 Ola Tarkowska: Basically significant knowledge was shared, and I feel there is quite a lot behind this as well. So I was wondering if people would be willing to write some 671 01:45:43.020 --> 01:45:55.440 Ola Tarkowska: Contribute to this specification.
Like, for example, like gathering requirements like the MoSCoW method is doing, so provide what must, what should, 672 01:45:55.950 --> 01:46:08.310 Ola Tarkowska: What could be there or what you don't want to see, based on your experience, modalities, various instruments you are using, various analysis techniques, and 673 01:46:09.810 --> 01:46:14.910 Ola Tarkowska: Is this something people would be interested to write down, to contribute to? 674 01:46:24.570 --> 01:46:27.660 Josh Moore: From my side, certainly, going back to, I think, where we started. 675 01:46:30.360 --> 01:46:32.700 Josh Moore: I don't have, I don't have the URL for you. 676 01:46:34.410 --> 01:46:43.710 Josh Moore: Welcome to take suggestions. But from our side, I think the output of everything we discussed needs to show up as issues. And I feel like with the two sessions we had 677 01:46:44.700 --> 01:46:56.250 Josh Moore: There's at least a good dozen highlights we need to think about. And then, Ola, I guess what you're proposing is a form of not voting on those but categorizing them in terms of 678 01:46:56.940 --> 01:46:58.410 Josh Moore: Priorities, yeah. 679 01:46:58.860 --> 01:47:16.650 Ola Tarkowska: Yeah, but this could help you to decide what the API actually should have and must have, what's up show now and what shouldn't be there, based on the use cases and experience, because this was right since the beginning, that we should operate on workflows. 680 01:47:19.530 --> 01:47:24.300 Ola Tarkowska: Well, it's up to everyone. So it's just a suggestion. This could highlight 681 01:47:26.760 --> 01:47:28.680 Ola Tarkowska: Lots of good things. 682 01:47:32.760 --> 01:47:35.400 Josh Moore: Yeah, so I think something like that definitely needs to happen. I'm a bit 683 01:47:36.750 --> 01:47:43.710 Josh Moore: unclear what's the path forward to make it happen. I mean, we're all busy so that'll be the biggest trick, um, 684 01:47:45.570 --> 01:47:52.980 Josh Moore: Okay.
So, kind of in that vein, so now I'm just going through the, we have nine minutes. So the next step topics that I listed: 685 01:47:55.230 --> 01:47:56.400 Josh Moore: I assume everyone's okay with 686 01:47:56.430 --> 01:47:59.160 Josh Moore: publishing the notes. They've been public. 687 01:48:00.030 --> 01:48:02.460 Josh Moore: publishing the recordings. I'll get all that online. 688 01:48:04.740 --> 01:48:06.570 Josh Moore: Lots of thumbs up or thumbs down. 689 01:48:08.520 --> 01:48:12.030 Josh Moore: Is there a general feeling that it's worth doing this regularly? 690 01:48:13.170 --> 01:48:16.230 Josh Moore: So I asked this morning as well. So in roughly a month. 691 01:48:17.430 --> 01:48:22.260 Josh Moore: So I hope from the OME side that we'll have some more results, we'll do another video or two. 692 01:48:23.970 --> 01:48:32.940 Josh Moore: I mean, I can just throw it out there on image.sc and whoever shows up shows up. So that's fine. If there are particular discussions that someone wants to happen 693 01:48:33.900 --> 01:48:43.410 Josh Moore: In that month period, somehow say the word, either now or in the document or on image.sc when we say, you know, when I open up the registration for the next one. 694 01:48:44.820 --> 01:48:48.510 Josh Moore: Anyone's welcome to take control of this, you know, do what you need to do, show something, 695 01:48:50.520 --> 01:48:55.230 Josh Moore: Whatever it takes to really keep us moving forward. So basically moving forward like Ola is talking about. 696 01:48:56.400 --> 01:48:59.460 Josh Moore: That was the topics. Videos, um, 697 01:49:05.790 --> 01:49:14.820 Josh Moore: So from my side, I'll just say I've experienced more and more that recording things helps with the communication, especially with all the time zones we're dealing with. 698 01:49:15.270 --> 01:49:19.170 Josh Moore: So that keeps me from needing to do the same presentation in the morning and in the evening.
699 01:49:19.800 --> 01:49:29.970 Josh Moore: And I know there's a napari meeting which takes place for the Pacific and California time zones. Um, I'll probably never make it to that. But, you know, 700 01:49:30.600 --> 01:49:36.570 Josh Moore: I'm more than happy to send a video somewhere or you can send a video from some other time zone to this time zone. You know, it's kind of like 701 01:49:38.850 --> 01:49:48.090 Josh Moore: A TARDIS of some form of sharing information. Basically, we need new ways of working. So if anyone has ideas, you know, we're open to 702 01:49:48.900 --> 01:50:02.790 Josh Moore: Having everyone be involved, but having lots of meetings, obviously, is draining and time consuming. So what's the fine balance and what do you all want to see happen? If you say everything should happen on GitHub, then, you know, so be it. That's fine. 703 01:50:03.900 --> 01:50:07.920 Josh Moore: Last point for me. Um, is there any feedback on how 704 01:50:09.090 --> 01:50:16.410 Josh Moore: The setup of this meeting took place on image.sc? Did it work for everyone? Were the notifications annoying? Were they visible? 705 01:50:18.030 --> 01:50:19.170 Josh Moore: Would you change anything? 706 01:50:23.790 --> 01:50:29.850 Josh Moore: So the only, so we did get the feedback this morning that having everyone say, me too, me too, me too, me too, 707 01:50:30.630 --> 01:50:41.550 Josh Moore: Was kind of annoying. So I'll try to come up with something that prevents that for next time. The top level session, and here's where I need a yes or no, or actually, I'll just say I'm going to do this, but you're welcome to veto it, 708 01:50:42.600 --> 01:50:53.250 Josh Moore: Is to set up a group on image.sc called NGFF, probably add all of you to it. And that way I don't have to add all of you to every thread, I can just add the group.
709 01:50:54.090 --> 01:51:04.620 Josh Moore: If you don't want to be in the group you're welcome to leave the group yourself or you could tell me in some form that I shouldn't do it. But it's gonna make my life simpler, so I'm looking forward to that. 710 01:51:06.930 --> 01:51:07.560 Josh Moore: Cool. 711 01:51:08.580 --> 01:51:13.770 Josh Moore: Five minutes left, floor is open. I can stick around for a bit. It's good to see everyone. 712 01:51:18.750 --> 01:51:21.960 Josh Moore: Stay safe, healthy, vote. I don't know. 713 01:51:24.120 --> 01:51:36.360 Jason Swedlow: Just, I'll just follow up. I mean, I put some comments in the chat. Clearly we're going forward with this, we have funding to do this and, you know, if solutions exist, tell us. 714 01:51:38.580 --> 01:51:46.080 Jason Swedlow: If you have specific requirements, as we were discussing a few minutes ago, around example metadata or whatever, say that as well. I mean, 715 01:51:48.150 --> 01:51:57.360 Jason Swedlow: It sounds like, I mean, the convergence, sure, is GitHub issues, but also these forums or image.sc, you know, pick your flavor. 716 01:52:01.650 --> 01:52:02.040 Or 717 01:52:03.330 --> 01:52:07.560 Jason Swedlow: And it's lovely to see you guys and see everybody reasonably healthy, so 718 01:52:09.960 --> 01:52:13.110 Josh Moore: If someone adds a topic of a beer session for the next one, then we can 719 01:52:14.790 --> 01:52:15.600 Josh Moore: We can make that happen. 720 01:52:16.200 --> 01:52:17.550 Jason Swedlow: So it will become very popular. 721 01:52:19.140 --> 01:52:22.890 Josh Moore: It's an official work event, right, I mean, you have to be allowed. 722 01:52:25.080 --> 01:52:31.620 Josh Moore: Cool. Then I mean, anyone who wants to disappear, feel free. I'm gonna hang out for a minute but, take care everyone. 723 01:52:33.150 --> 01:52:33.780 Mark Kittisopikul: Thank you. 724 01:52:34.830 --> 01:52:35.250 Ola Tarkowska: Hi.
725 01:52:37.800 --> 01:52:41.370 Eric Perlman: Josh, I don't know how you managed to host two of these in one day. 726 01:52:44.700 --> 01:52:46.560 Eric Perlman: You have some superhuman skills here. 727 01:52:47.490 --> 01:52:48.210 Josh Moore: I am 728 01:52:49.740 --> 01:53:01.680 Josh Moore: I don't think I really, I didn't prepare anything. So it's just kind of like, you know, I was kind of nervous it was going to be absolute chaos, but everyone behaved so it's all fine, right, it could have been horrible, but it was lovely. 729 01:53:04.830 --> 01:53:09.720 Josh Moore: It's good talking to people. I haven't been to a conference since February, good grief. 730 01:53:12.420 --> 01:53:13.380 Josh Moore: A real conference. 731 01:53:14.400 --> 01:53:26.010 Ulrike Boehm: But also a comment to Jason, like thank you for putting together this paper about formats and repositories to kick this off, because also this is super crucial because I think we need also there 732 01:53:26.760 --> 01:53:35.820 Ulrike Boehm: guidelines and standards, I don't know, I don't like to hear this word anymore, I have to say, I used it so often already, but also with the repository part 733 01:53:36.210 --> 01:53:47.820 Ulrike Boehm: I really do have to say that although there are lots of repositories out there for image data, I really have to say they still also try to find themselves. That sounds a little bit esoteric right now but 734 01:53:48.810 --> 01:53:54.480 Ulrike Boehm: We had a talk at Janelia from someone who's also in charge of, I think, the image database. 735 01:53:55.320 --> 01:54:03.030 Ulrike Boehm: And she was only saying they still don't know how much data they actually want to put in these repositories and so on and so forth. So 736 01:54:03.420 --> 01:54:11.160 Ulrike Boehm: If it's kind of only the data of the paper or everything.
So this is also something where the entire community actually has to be 737 01:54:11.640 --> 01:54:18.630 Ulrike Boehm: Involved, because I think decisions are being made. But I think when decisions are made, I think the people who have the data, who produce the data, 738 01:54:19.230 --> 01:54:31.440 Ulrike Boehm: Should be part of these discussions and I mean I'm just which kind of supernatural right but I think sometimes people are over ambitious to get things done and then they sometimes might miss important 739 01:54:31.440 --> 01:54:33.510 Ulrike Boehm: Important. And just as a 740 01:54:36.810 --> 01:54:37.140 Jason Swedlow: Yes. 741 01:54:38.160 --> 01:54:42.420 Jason Swedlow: Yes. Oh, all correct and 742 01:54:45.210 --> 01:54:45.570 Ulrike Boehm: Yeah. 743 01:54:45.990 --> 01:54:59.820 Jason Swedlow: Yeah, that whole side of things, the data repository, data resource side of things is probably, it's just probably the shortest thing to say is it's early days, and we're going to stumble for a while and 744 01:55:01.260 --> 01:55:12.000 Jason Swedlow: Right before lockdown, I had the pleasure of having a drink with Helen Berman, who's been doing the PDB for 50 years and, you know, she said, Yeah, you know, 745 01:55:12.210 --> 01:55:13.740 Josh Moore: Should I, should I turn off the recording? 746 01:55:14.640 --> 01:55:14.970 Probably 747 01:55:16.410 --> 01:55:17.220 Josh Moore: Turning off the recording. 748 01:55:20.820 --> 01:55:22.170 Jason Swedlow: Or live was yeah you know